[HN Gopher] Bing AI can't be trusted
       ___________________________________________________________________
        
       Bing AI can't be trusted
        
       Author : dbrereton
       Score  : 725 points
       Date   : 2023-02-13 16:40 UTC (6 hours ago)
        
 (HTM) web link (dkb.blog)
 (TXT) w3m dump (dkb.blog)
        
       | wefarrell wrote:
       | The amount of trust people are willing to place in AI is far more
       | terrifying than the capabilities of these AI systems. People are
       | too willing to give up their responsibility of critical thought
       | to some kind of omnipotent messiah figure.
        
         | dagw wrote:
         | _The amount of trust people are willing to place in AI is far
         | more terrifying than the capabilities of these AI systems._
         | 
          | I don't know, people seem to quickly adjust their expectations
          | to reality. Judging by the conversations about ChatGPT that I'm
          | hearing around me, people have become a lot more sceptical over
          | just the last couple of weeks, as they've gotten a chance to
          | use the system hands-on rather than just read articles about
          | it.
        
           | kahnclusions wrote:
           | I've tried playing around with it quite a bit for the "choose
           | your own adventure" type games.
           | 
           | It's really good at generating text and following prompts.
           | Letting later responses use the previous prompts and
           | responses as input really gives the illusion of a
           | conversation.
           | 
           | But it's extremely limited... you run up against the limits
           | of what it really can do very quickly.
        
           | kenjackson wrote:
            | I've gone just the opposite way. The ability to ask my own
            | questions has impressed me more than the articles did, in
            | part because articles always hype up technology in a crazy
            | way. This is probably the first technology since the Apple
            | iPhone touchscreen where I'm like, "this is better than the
            | hype seemed to convey".
            | 
            | I think the goalposts have moved greatly though. Just a
            | couple of years ago, most lay tech folks would've laughed you
            | out of the room if you suggested this could be done. The
            | summarized answers by Bing and Google are relics
            | comparatively.
        
         | throw8383833jj wrote:
         | well, people already do that with their news feed.
        
           | commandlinefan wrote:
           | And before social media news feeds, people were doing that
           | with newspapers for generations. Those people have always
           | been around.
        
             | [deleted]
        
         | jackmott42 wrote:
         | Which person or persons specifically are you referring to?
        
           | noobermin wrote:
           | Every other day, a poster comments how they make
           | presentations or write code using chatgpt. Just in another
           | thread, someone posted how chatgpt solved their coding
           | problem...which a quick google search would have solved as
           | well, as others in the replies to it pointed out.
           | 
            | Whenever I've used ChatGPT I was impressed at the surface
            | level, but digging deeper into a convo always turned up
            | circular HS-tier BS'ing. The fact that so many people online
            | and on HN are saying ChatGPT is astounding and revolutionary
            | just betrays that such HS-essay-level BS is convincing to
            | them, and it's a somewhat depressing thought that so many
            | people are so easily taken in by a confidence trick.
        
             | rockinghigh wrote:
             | I have been using ChatGPT successfully for coding tasks for
             | the last two months. It's often faster than a Google search
             | because it delivers the code, can explain it, and can write
             | a unit test.
        
             | pixl97 wrote:
             | >which a quick google search would have solved as well,
             | 
             | You mean 'may' have solved that. Google is becoming a
             | bullshit generator for SEO farms and spitting out ads at
             | such a rate it can be near useless for some questions.
             | 
              | Now the real question is: can we tell when Google or GPT is
              | dumping bullshit on us?
        
         | dqpb wrote:
         | Prove that this is actually happening.
        
           | tiborsaas wrote:
            | I've talked to people and read a lot of comments, but there's
            | probably no proof that you'd accept. My impression is that
            | this attitude definitely exists. Some people are already
            | ditching search engines and relying mostly on ChatGPT, and
            | some even talk about AI tech in general with religious awe.
        
         | hummus_bae wrote:
         | Although AI can't be trusted, we can trust that AI can't be
         | trusted.
        
         | computerex wrote:
         | This is why when I spot a Tesla on the road, I make every
         | effort to try and get as far away from it as possible. Placing
         | a machine vision model at the helm of a multi-ton vehicle has
         | got to be one of the dumbest things the regulators have let
         | Elon get away with.
        
           | jackmott42 wrote:
           | Tesla isn't the only brand that is doing that, and most of us
           | didn't pay for it and are driving ourselves. So maybe pretend
           | to be terrified by something else.
        
             | GuB-42 wrote:
             | Other brands tend not to rely so much on a vision model for
             | their driving assistance features, instead relying more on
             | a variety of sensors, traditional computation, and curated
             | databases.
        
               | jayd16 wrote:
               | ChatGPT is bad...but ChatGPT with a radar is something
               | I'll trust.
        
               | veb wrote:
               | If it has a radar, then it should have something to do
               | with the sky... hmm, what would be a good name?
        
           | withinboredom wrote:
           | Saw a Tesla get pulled over on the Autobahn for holding the
           | left lane and being too slow. That was one of my favorite
           | moments on the road. Second only to watching a team of
           | truckers box in an asshole driver while going up a mountain.
        
           | Der_Einzige wrote:
            | Agreed, they tend to drive extremely badly and I don't trust
            | them. Avoid them, anything Nissan, and lifted trucks
            | (especially Dodge Rams). Stats show that a solid 5% of all
            | Ram 1500 drivers have a DUI on record. It's like 0.1% for
            | Prius drivers.
           | 
           | re:machine vision - What's particularly bad is that they
           | don't actually put enough sensors in the cars to be safe.
           | 
           | Pre-collision systems, blind spot monitors, radar, parking
           | sensors, etc are all so helpful and objectively good for
           | drivers. Doing a vision only implementation and then claiming
           | "full self driving" is where it gets awful.
        
           | _joel wrote:
           | Has it been proven to be worse than normal drivers that pass
           | their test?
        
             | jldugger wrote:
             | Where would that data live and how would we disentangle it
             | from income effects?
        
               | oakashes wrote:
               | I don't think you would need to disentangle it from
               | income effects to address the concern of someone who
               | actively avoids Teslas on the road, unless they also
               | actively avoid cars that look like they belong to people
               | with low income.
        
               | jldugger wrote:
               | > I don't think you would need to disentangle it from
               | income effects to address the concern
               | 
               | Good point, I was mainly thinking of the 'AI saves lives'
               | argument.
               | 
               | > of someone who actively avoids Teslas on the road,
               | unless they also actively avoid cars
               | 
               | Probably they should avoid hitting _any_ cars =)
        
             | kypro wrote:
             | What does worse mean? Tesla's AI drives worse than an 80
             | year old with cataracts, but on the other hand it can react
             | faster to obstacles than the fastest race-car drivers.
             | 
             | I don't own a Tesla so have no direct experience, but my
             | guess would be that it might crash less than a human, but
             | has far more near-misses.
        
               | generalizations wrote:
               | Worse probably means something like "accidents per mile".
               | 
               | Edit: was curious, so I did the math. The standard unit
               | seems to be 'crashes per 100 million miles' so according
               | to the Tesla safety report (only source on Autopilot
               | safety I could find easily) that works out to [0]: one
               | accident for every 4.41 million miles driven = 22.7
               | accidents per 100 million miles.
               | 
               | Without autopilot, the AAA foundation (again, easily
               | available source, feel free to find a better one, this is
               | from 2014) [1] says that our best drivers are in the
               | 60-69 age range, and have 241 accidents per 100 million
               | miles. Our worst drivers are our younger teens (duh) with
               | 1432 crashes per 100 million miles.
               | 
               | So unless you can find better data that contradicts this,
               | Autopilot seems like a net benefit.
               | 
               | [0] https://insideevs.com/news/537818/tesla-autopilot-
               | safety-rep...
               | 
               | [1] https://aaafoundation.org/rates-motor-vehicle-
               | crashes-injuri...
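                | 
                | As a quick sanity check on that conversion, a minimal
                | sketch in Python (the figures are the ones quoted above;
                | the Tesla number takes their safety report at face
                | value):
                | 
                |     # turn "one crash per X million miles" into
                |     # "crashes per 100 million miles"
                |     def crashes_per_100m_miles(million_miles_per_crash):
                |         return 100 / million_miles_per_crash
                | 
                |     print(crashes_per_100m_miles(4.41))  # Autopilot: ~22.7
                |     # AAA figures are already per 100M miles:
                |     # ages 60-69: 241, young teens: 1432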
        
               | alldayeveryday wrote:
               | I think your analysis is a good one given the data we
               | have, and can be used to draw some conclusions or guide
               | general discussion. However the analysis is indeed
               | limited by the data available. The AAA data does not
               | consider variability by gender, race, socioeconomics,
               | location, etc. Further it does not normalize variability
               | in the types of driving being done (teslas have limited
               | range, are not towing trailers, etc), nor other
               | technological advances (modern vs older vehicles).
        
               | danso wrote:
                | Autopilot only works on highways and in decent weather
                | and visibility -- i.e. the least accident-prone scenarios
                | possible. AFAIK, AAA's stats count accidents in all
                | situations.
        
               | kjkjadksj wrote:
                | I'd wager that for a modern car it doesn't crash any
                | less. My modern Toyota also stops faster than I can,
                | thanks to a big radar unit hooked up to the brakes.
        
               | fragmede wrote:
                | In fact, almost all new cars since 2022, Teslas included,
                | have an Automatic Emergency Braking system (AEB), which
                | will hit the brakes if you're about to hit something. If
                | I were walking in a parking lot and had to step in front
                | of an older SUV or a Tesla, I'd step in front of the
                | Tesla.
               | 
               | https://www.jdpower.com/cars/shopping-guides/what-is-
               | automat...
        
             | flandish wrote:
             | Consider this:
             | 
              | A Tesla's FSD code has gone through whatever SDLC they
              | chose, involving project management.
              | 
              | I distrust project management.
              | 
              | Therefore: I only trust a Tesla's FSD mode as much as I
              | trust a project manager's driving.
        
             | grogenaut wrote:
             | Do you need it to be proven scientifically to take caution
             | on something? I take caution around every car.
        
             | [deleted]
        
             | FartyMcFarter wrote:
             | Whether it's "proven" to do so probably depends on what
             | kind of proof you're looking for.
             | 
             | But there are plenty of news articles and youtube videos
             | that show it doing ridiculous unsafe things (including
             | running into a truck that was driving perpendicular to the
             | road). So I highly doubt it's as good as a normal driver,
             | in fact I'd be shocked if it is.
        
               | sebzim4500 wrote:
               | >But there are plenty of news articles and youtube videos
               | that show it doing ridiculous unsafe things
               | 
               | There are also plenty of videos of human drivers doing
               | absurd things, so this is hardly an argument.
               | 
               | All that matters is whether a random Tesla that you see
               | on the street is more likely to crash into you than a
                | different car. I know that Tesla has published their own
                | statistics which say that it isn't, but I would be very
                | interested in seeing an independent study about this.
        
         | gl-prod wrote:
          | Because linking the term AI with a chatbot misleads people
          | into thinking it's something like AGI or Iron Man's JARVIS.
        
         | yetihehe wrote:
          | If we delete the first sentence of your post and leave only
          | 
          | > People are too willing to give up their responsibility of
          | critical thought to some kind of omnipotent messiah figure.
          | 
          | This essentially describes humans, probably since before we
          | became homo sapiens. We again and again choose for positions
          | of power those who can look competent instead of actually
          | competent people.
        
           | LesZedCB wrote:
           | i wish somebody would write an entire science fiction series
           | about this! maybe set on a desert planet, after humanity made
           | the same mistake with intelligent machines as well
           | 
           | https://en.wikipedia.org/wiki/Dune_(novel)
        
       | insane_dreamer wrote:
       | What shocks me is not that Bing got a bunch of stuff wrong, but
       | that:
       | 
       | - The Bing team didn't check the results for their __demo__ wtaf?
       | Some top manager must have sent down the order that "Google has
       | announced their thing, so get this out TODAY".
       | 
       | - The media didn't do fact checking either (though I hold them
       | less accountable than the Bing/Msft team)
        
       | chasd00 wrote:
       | if ChatGPT could ask questions back it would be a very effective
       | phishing tool. People put a lot of blind faith in what they
       | perceive as intelligence. You know, a MITM attack on a chatbot
       | could probably be used to get a lot of people to do anything
       | online or IRL.
        
       | TEP_Kim_Il_Sung wrote:
       | AI should probably stick to selling paperclips. There's no chance
       | to screw that up.
        
       | partiallypro wrote:
       | AI can't be trusted in general, at least not for a long time. It
        | gets basic facts wrong, constantly. The fear is that it will
        | start eating its own dogfood and become more and more wrong,
        | since we are putting it in the hands of people who don't know any
        | better and are going to use it to generate tons of online content
        | that will later be used in the models.
       | 
        | It does make some queries much easier to answer. For instance, I
        | had trouble finding out whether the runners-up got the win in the
        | Tour de France after the Armstrong doping scandal, and it
        | answered instantly. The problem is that it offers answers with
        | confidence. I think their adding citations is an improvement over
        | ChatGPT, but it needs more.
       | 
       | Luckily, it's still a beta product and not in the hands of
       | everyone. Unfortunately, ChatGPT is, which I find more
       | problematic.
        
       | Havoc wrote:
       | Surprised anyone is getting excited about these mistakes at all.
       | Expecting them to be fully accurate is simply not realistic
       | 
       | The fact that they're producing anything coherent at all is a
       | feat
        
         | BaseballPhysics wrote:
          | Uh, the technology is being integrated into a _search engine_.
          | Its job is to surface real information, not made-up BS.
         | 
         | No one would be "getting excited" about this if Microsoft
         | wasn't selling this as the future of search.
        
       | m3kw9 wrote:
        | If it flops on certain information and the UI isn't properly
        | adjusted to limit the things it does poorly, it will backfire
        | on MS.
        
       | oldstrangers wrote:
       | I had this idea the other day concerning the 'AI obfuscation' of
       | knowledge. The discussion was about how AI image generators are
       | designed to empower everyone to contribute to the design process.
        | But I argued that you can only reasonably contribute to the
        | process if you can actually articulate the reasoning behind your
        | contributions. If an AI made it for you, you probably can't,
       | because the reasoning is simply "this is the amalgamation of
       | training data that the AI spat out." But, there's a realistic
       | version of reality where this becomes the norm and we
       | increasingly rely on AI to solve for issues that we don't
       | understand ourselves.
       | 
       | And, perhaps more worrying, the more widely adopted AI becomes,
       | the harder it becomes to correct its mistakes. Right now millions
       | of people are being fed information they don't understand, and
       | information that's almost entirely incorrect or inaccurate. What
       | is the long term damage from that?
       | 
       | We've obfuscated the source data and essentially the entire
       | process of learning with LLMs / AIs, and the path this leads down
        | seems pretty obviously a net negative for society (outside of
        | short-term profit for the stakeholders).
        
         | kneebonian wrote:
          | I've said it before and I'll warn of it again here: my biggest
          | concern for AI, especially at this stage, is that we abandon
          | understanding in favor of letting the AI generate; then the AI
          | generates that which we do not understand but must maintain.
          | Then we don't know why we are doing what we are doing, but we
          | know that it causes things to work how we want.
         | 
         | Suddenly instead of our technology being defined by reason and
         | understanding our technology is shrouded in mysticism, and
         | ritual. Pretty soon the whole thing devolves into the tech
         | people running around in red robes, performing increasingly
         | obtuse rituals to appease "the machine spirit", and praying to
         | the Omnissiah.
         | 
         | If we ever choose to abandon our need for understanding we will
         | at that point have abandoned our ability to progress.
        
         | nix0n wrote:
         | People are already misusing statistical models, in ways that
         | are already causing harm to people.
         | 
         | See this HN thread from 2016[0], which also points to [1](a
         | book) and [2](PDF).
         | 
         | I definitely agree with you that it's going to get a lot worse
         | with AI, since it makes it harder to see that it is a
         | statistical model.
         | 
         | [0]https://news.ycombinator.com/item?id=12642432
         | [1]https://www.amazon.com/Weapons-Math-Destruction-Increases-
         | In... [2]https://nissenbaum.tech.cornell.edu/papers/biasincompu
         | ters.p...
        
         | Madmallard wrote:
         | ChatGPT can give you a full description of why it made the
         | decision it did and it usually is fairly accurate.
        
       | danans wrote:
       | What the hype machine still doesn't understand is that it's a
       | _language_ model, not a knowledge model.
       | 
       | It is optimized to generate information that looks as much like
       | language as possible, not knowledge. It may sometimes regurgitate
       | knowledge if it is simple or well trodden enough knowledge, or if
       | language trivially models that knowledge.
       | 
       | But if that knowledge gets more complex and experiential, it will
       | just generate words without attachment to meaning or truth,
       | because fundamentally it only knows how to generate language, and
       | it doesn't know how to say "I don't know that" or "I don't
       | understand that".
        
       | pphysch wrote:
       | LLM+Search has to be all about ad injection, right?
       | 
       | As a consumer, it seems the value of LLM/LIM(?) is advanced
       | autocomplete and concept/content generation. I would pay some
       | money for these features. LLM+Search doesn't appeal to me much.
        
         | mtmail wrote:
         | "With deeply personalized experiences we expect to be able to
         | deliver even more relevant messages to consumers, with the goal
         | of improved ROI for advertisers."
         | 
         | https://about.ads.microsoft.com/en-us/blog/post/february-202...
        
       | tasty_freeze wrote:
        | Supposedly, Joseph Weizenbaum kept the chat logs of Eliza so he
        | could better see where his list of canned replies was falling
        | short. He was horrified to find that people were really
        | interacting with it as if it understood them.
       | 
       | If people fell for the appearance of AI that resulted from a few
       | dozen canned replies and a handful of heuristics, I 100% believe
       | that people will be taken in by ChatGPT and ascribe it far more
       | intelligence than it has.
        
         | LesZedCB wrote:
         | papers are coming out weekly about their emergent properties.
         | 
         | despite people wanting transformers to be nothing more than
         | fancy, expensive excel spreadsheets, their capabilities are far
         | from simple or deterministic.
         | 
         | the fact that in-context learning is getting us 80%ish of the
         | way to tailored behavior is just fucking incredible. they _are_
         | definitely, meaningfully intelligent in some (not-so-small)
         | way.
         | 
         | this paper[1] goes over quite a few examples and models
         | 
         | [1] https://storage.googleapis.com/pub-tools-public-
         | publication-...
        
       | flandish wrote:
       | >Bing
       | 
       |  _No_ AI can be trusted. FTFY.
        
       | seydor wrote:
        | I can't wait for the era of the conversational web so I can do
        | away with clickbait titles and opinions. Truly everyone has one.
        | The experiment with "open publishing" has so far only proved that
        | signal to noise remains constant.
        
         | notacoward wrote:
         | > so i can do away with clickbait titles and opinions
         | 
         | Do you actually think that will be the result? Why not the
         | _exact opposite_? ChadGPT and the others are for all practical
         | purposes trained to create content that is superficially
         | appealing and plausible - i.e. perhaps not clickbait but a
         | related longer-form phenomenon - without any underlying insight
         | or connection to truth. That would make conversational AI even
          | _more_ of a time sink than today's clickbait. Why do you
         | imagine it would turn out otherwise?
        
       | thorum wrote:
        | The errors when summarizing the Gap financial report are quite
        | surprising to me. I copied the same source paragraph (which is
        | very clearly phrased) into ChatGPT and it summarized it
        | accurately.
       | 
       | Is it possible they are 'pre-summarizing' long documents with
       | another algorithm before feeding them to GPT?
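        | 
        | If they are doing something like that, a minimal sketch of what
        | such a pipeline might look like (purely speculative on my part;
        | the summarize argument stands in for whatever model call they
        | actually use):
        | 
        |     def presummarize(document, summarize, chunk_size=2000):
        |         # split into chunks that fit the context window,
        |         # summarize each, then summarize the summaries
        |         chunks = [document[i:i + chunk_size]
        |                   for i in range(0, len(document), chunk_size)]
        |         return summarize("\n".join(summarize(c) for c in chunks))
        | 
        | Every extra summarization pass is a chance to drop or distort a
        | figure, which could explain errors that don't appear when the
        | source paragraph is pasted directly into ChatGPT.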
        
       | coliveira wrote:
        | I think ChatGPT and its lookalikes spell the end of the public
        | internet as we know it. People now have tools to generate pages
        | as they see fit. Google will not be able to determine which pages
        | are high quality if everything looks the same and is generated by
        | AI bots. Users will be unable to find trustworthy results, and
        | many of those results will be filled with generated garbage that
        | looks great but is ultimately false.
        
       | wwwpatdelcom wrote:
        | By putting together some YouTube videos on the topic, I have been
        | trying to help folks understand the underlying mechanisms of
        | these generative LLMs, so it's not such a surprise when we get
        | wrong answers from them.
       | 
       | * [On the question of replacing
       | Engineers](https://www.youtube.com/watch?v=GMmIol4mnLo)
       | 
       | * [On AI Plagiarism](https://www.youtube.com/watch?v=whbNCSZb3c8)
       | 
        | A consensus now seems to be building on Hacker News that there is
        | huge over-hype. Hopefully these two videos help show some of the
        | nuance behind why it's over-hyped.
        | 
        | That said, since language generation is probabilistic, a given
        | transformer-based language model can be trained or fine-tuned to
        | make fewer errors in a particular domain - so this is all far
        | from settled.
       | 
       | Long-term, I think we're going to see something closer to human
       | intelligence from CNN's and other forms of neural networks than
       | from transformers, which are really a poor man's NN. As hardware
       | advances and NN's inevitably become cheaper to run, we will
       | continue to see scarier and scarier A.I. -- I'm talking over a
       | 10-20 year timeframe.
        
         | whimsicalism wrote:
         | HN was always going to be overly pessimistic with regards to
         | this stuff, so this was utterly predictable.
         | 
         | I work in this field & it almost pains me to see it come into
         | the mainstream and see all of the terrible takes that pundits
         | can contort this into, ie. LLM as a "lossy jpeg of the
         | internet" (bad, but honestly one of the better ones).
        
           | wwwpatdelcom wrote:
           | Yes..."Lossy JPEG," at least describes the idea that there
           | is, _some_ kind of  "subsampling," going on, rather than
           | just...a magical box?
           | 
           | I think most laypeople understand the simple statement, "it's
           | a parrot."
           | 
           | I had the original author of this paper reach out to me about
           | my plagiarism video on Mastodon:
           | 
           | https://dl.acm.org/doi/10.1145/3442188.3445922
           | 
           | The idea of a lossy JPEG/Parrot helps capture the idea that
           | there are dangers and opportunities in LLM's. You can have
           | fake or doctored images spread, you can have a Parrot swear
           | at someone and cause un-needed conflict - but they can also
           | be great tools and/or cute and helpful companions, as long as
           | we understand their limitations.
        
             | sebzim4500 wrote:
             | The statement "it's a parrot" may be simple to understand
             | but frankly I don't think many people who have used chatGPT
             | will believe it.
             | 
             | At least "lossy JPEG" feels vague enough to be
             | unfalsifiable.
        
             | whimsicalism wrote:
             | The issue is that it doesn't just recreate things it was
             | trained on, it generates _novel content_. There is no
             | reason that novel pathways of  "thought" (or whatever makes
             | one comfortable) aren't emergent in a model under
             | optimization & regularization.
             | 
             | This is what the "lossy compression" and "stochastic
             | parrot" layperson models do not capture. Nonetheless,
             | people will lap them up. They want a more comfortable
             | understanding that lets them avoid having to question their
             | pseudo-belief in souls and the duality of mind and body.
             | Few in the public seem to want to confront the idea of the
             | mind as an emergent phenomenon from interactions of neurons
             | in the brain.
             | 
             | It is not simply regurgitating training data like everyone
             | seems to want it to.
        
               | wwwpatdelcom wrote:
               | But what is novel content?
               | 
                | I can easily falsify the accusation that "people
                | underestimate transformers and don't see that they are
                | actually intelligent": I defeated the best open-source
                | transformer-based word embedding (at the time) with a
                | simple TF-DF based detector (this was back in September).
                | 
                | https://www.patdel.com/plagiarism-detector/
                | 
                | No, these things are not "emergent"; they are just
                | rearranging numbers. You don't have to use a transformer
                | or neural network at all to rearrange numbers and create
                | something that is even more "artificially intelligent"
                | than one that does use transformers, it turns out!
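                | 
                | Not the actual detector, but a generic sketch of the kind
                | of simple term-frequency comparison I mean (TF-IDF via
                | scikit-learn here, as one stand-in):
                | 
                |     from sklearn.feature_extraction import text
                |     from sklearn.metrics import pairwise
                | 
                |     def max_similarity(candidate, sources):
                |         # how close is a candidate text to any known
                |         # source document?
                |         vec = text.TfidfVectorizer().fit(sources)
                |         sims = pairwise.cosine_similarity(
                |             vec.transform([candidate]),
                |             vec.transform(sources))
                |         return sims.max()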
        
               | hackinthebochs wrote:
               | >No, these things are not, "emergent," they are just
               | rearranging numbers.
               | 
               | This is a bad take. Most ways to "rearrange numbers"
               | produce noise. That there is a very small subset of
               | permutations that produce meaningful content, and the
               | system consistently produces such permutations, is a
               | substantial result. The question of novelty is whether
               | these particular permutations have been seen before, or
               | perhaps are simple interpolations of what has been seen
                | before. I think it's pretty obvious the space of possible
               | meaningful permutations is much larger than what is
               | present in the training set. The question of novelty then
               | is whether the model can produce meaningful output (i.e.
               | grammatically correct, sensible, plausible) in a space
               | that far outpaces what was present in the training
               | corpus. I strongly suspect the answer is yes, but this is
               | ultimately an empirical question.
        
               | wwwpatdelcom wrote:
               | I would love to read anything you have written about the
               | topic at length. Thanks for your contribution.
        
               | whimsicalism wrote:
               | I can tell that this conversation is not going to be
               | super productive, so a few brief thoughts:
               | 
               | > I can easily falsify the accusation that, "people
               | underestimate transformers and don't see that they are
               | actually intelligent,"
               | 
               | I think that you have an idiosyncratic definition of what
               | "falsify" means compared to what most might. Getting away
               | from messy definitions of "intelligent" which I think are
               | value-laden, I see nothing in your blog post that
               | falsifies the notion that LLMs can generate novel content
               | (another fuzzy value-laden notion perhaps).
               | 
               | > these things are not, "emergent," they are just
               | rearranging numbers.
               | 
               | It seems non-obvious to me that 'rearranging numbers'
               | cannot lead to anything emergent out of that process, yet
               | cascading voltage (as in our brain) can.
        
               | wwwpatdelcom wrote:
               | I would love to read anything you have written or studied
               | about this topic at length. Thanks for your replies.
        
               | noobermin wrote:
               | >There is no reason that novel pathways of "thought" (or
               | whatever makes one comfortable) aren't emergent in a
               | model under optimization & regularization.
               | 
               | Please substantiate this assertion. People always just
               | state it as a fact without producing an argument for it.
        
               | whimsicalism wrote:
               | You're asking me to substantiate a negative - ie.
               | identify any possible reason someone might provide that
               | novel behavior might not be emergent out of a model under
               | optimization and then disprove it, but ahead of time.
               | This is a challenging task.
               | 
               | Our minds are emergent out of the interaction of billions
               | of neurons in our brain. Each is individually pretty
               | dumb, just taking in voltage and outputting voltage (to
               | somewhat oversimplify). Out of that simple interaction &
               | under the pressures of evolutionary optimization, we have
               | reached a more emergent whole.
               | 
                | Linear transformations stacked with non-linearities can
                | similarly create individually dumb inputs and outputs
                | that, under the pressure of optimization, lead to a more
                | emergent whole. If there is a reason why this has to be
                | tied to a voltage-regulating neuron substrate, I have yet
                | to see a compelling one.
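                | 
                | Concretely, the individually dumb pieces are nothing more
                | than this (a toy two-layer forward pass in NumPy; the
                | random weights stand in for whatever training would
                | actually find):
                | 
                |     import numpy as np
                | 
                |     rng = np.random.default_rng(0)
                |     W1 = rng.normal(size=(4, 8))
                |     W2 = rng.normal(size=(8, 2))
                | 
                |     def forward(x):
                |         h = np.maximum(0, x @ W1)  # linear map + ReLU
                |         return h @ W2              # another linear map
                | 
                |     print(forward(rng.normal(size=4)))
                | 
                | No single multiply-add in there is smart; whatever
                | capability shows up lives in how optimization arranges
                | billions of them.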
        
               | mtlmtlmtlmtl wrote:
                | I think it's unfair and asinine to caricature sceptics as
               | ignorant people in denial, holding on to some outdated
               | idea of a soul. That's the sort of argument someone makes
               | when they're so entrenched in their own views they see
               | nothing but their own biases.
        
               | sebzim4500 wrote:
               | Being sceptical of chatGPT is entirely reasonable, and
               | there is plenty of room for discussion on exactly when we
                | will hit the limits of scaling LLMs.
               | 
               | No one who has used chatGPT more than a couple of times
               | will argue in good faith that it is a "parrot", however,
               | unless they have an extremely weird definition of
               | "parrot".
        
               | whimsicalism wrote:
               | Ask people to describe how they think the mind functions
               | and you will very often get something very akin to soul-
               | like belief. Many, many people are not very comfortable
               | with the mind as emergent phenomenon. A straight majority
               | of people in the US (and likely globally) believe in
               | souls when polled, you are the one imputing the words of
               | "ignorant people in denial" onto my statement of why
               | people find views to the contrary uncomfortable.
               | 
               | I understand that HN is a civil community. I don't think
               | it is crossing the line to characterize people I disagree
               | with as wrong and also theorize on why they might hold
               | those wrong beliefs. Indeed, you are doing the same thing
               | with my comment - speculating on why I might hold views
               | that are 'asinine' because I see 'nothing but [my] own
               | biases.'
        
               | mtlmtlmtlmtl wrote:
               | I'm not saying it's not true of most people in the world,
               | but that doesn't make it a constructive argument. And you
               | didn't use the words ignorant and denial, but they're
               | reasonable synonyms to what you did say.
               | 
               | When I do the "same thing" I'm really saying that when
               | you represent yourself as from the field, you might want
               | to cultivate a more nuanced view of the people outside
               | the field, if you want to be taken seriously.
               | 
               | Instead, given the view you presented, I'm forced to give
               | your views the same credence I give a physicist who says
               | their model of quantum gravity is definitely the correct
               | one. I.e: "sure, you'd say that, wouldn't you"
        
               | whimsicalism wrote:
               | I am providing a reason why "the public" might be
               | uncomfortable around these ideas. You accuse me of
               | misrepresenting the public's beliefs as ignorant and
               | outdated when really the public has a nuanced view on
               | this subject. I am merely taking the majority of people
               | at their word when they are polled on the subject.
               | 
               | Most people believe in souls. Most people do not believe
               | in minds as emergent out of interactions of neurons. I am
               | not sure how to cultivate a more nuanced view on this
               | when flat majorities of people say when asked that they
               | hold the belief I am imputing on them.
               | 
               | Am I saying that this is where all skepticism comes from?
               | No. Is it a considerable portion? Yes.
        
               | SketchySeaBeast wrote:
               | If we think of the tools as generating entirely novel
               | content then I'd suggest we're using them for the wrong
               | thing here - we shouldn't be using it at all as a
               | glorified (now literal) search engine, it should be
               | exploring some other space entirely. If we discovered a
               | brand new sentient creature we wouldn't immediately try
               | to fill its head with all the knowledge on the internet
               | and then force it to answer what the weather will be
               | tomorrow.
        
               | whimsicalism wrote:
               | I have no idea what sentience really means, but I think
               | novel content generation is a necessary but not
               | sufficient component.
        
               | SketchySeaBeast wrote:
               | True, I was overly grandiose. Regardless, we're taking
               | something that can apparently generate new intellectual
               | content, but we're using it as a beast of burden.
        
       | 1vuio0pswjnm7 wrote:
       | When Google's Bard AI made a mistake, GOOG share price dropped
       | over 7%.
       | 
        | What about Baidu's Ernie AI?
       | 
       | Common retort to criticism of conversational AI is "But it's
       | useful."
       | 
        | Yes, it is useful as a means to create hype that can translate
        | into increases in stock price and increased web traffic (and
        | thereby increased revenue from advertising services).
       | 
       | https://www.reuters.com/technology/chinas-baidu-finish-testi...
        
       | jmount wrote:
        | It can't be emphasized enough: this isn't a procedure failing
        | when used - this is a canned recording of it failing. This means
        | the group either didn't check the results, or did check them and
        | saw no way forward other than getting this out the door. It is
        | only a small sample, but it is fairly damning that it is hard to
        | produce error-free curated examples.
        
       | ddren wrote:
       | Out of curiosity, I searched the pet vacuum mentioned in the
        | first example, and found it on Amazon [0]. Just like Bing says,
        | it is a corded model with a 16-foot cord, and searching the
        | reviews for "noise" shows that many people think it is too
        | loud. At least in this case, it seems that Bing got it right.
       | 
       | [0]: https://www.amazon.com/Bissell-Eraser-Handheld-Vacuum-
       | Corded...
        
         | dboreham wrote:
         | Curious why someone would keep a vacuum as a pet.
        
         | Merad wrote:
         | Bing actually got tripped up by HGTV simplifying a product name
         | in their article. It used this HGTV [0] article as its source
         | for the top pet vacuums. The article lists the "Bissell Pet
         | Hair Eraser Handheld Vacuum" and links to [1] which is actually
         | named "Bissell Pet Hair Eraser Lithium Ion Cordless Hand
         | Vacuum". The product you found is the "Bissell Pet Hair Eraser
         | Handheld Vacuum, Corded." A human likely wouldn't even notice
         | the difference because we'd just follow the link in the
         | article, or realize the corded vacuum was the wrong item based
         | on its picture, but Bing has no such understanding.
         | 
         | [0]: https://www.hgtv.com/shopping/product-reviews/best-
         | vacuums-f...
         | 
         | [1]: https://www.amazon.com/BISSELL-Eraser-Lithium-Handheld-
         | Cordl...
        
         | jiggyjace wrote:
          | Yeah, this is my experience cross-checking the article with my
          | own Bing AI. When I try to replicate the Appendix section, Bing
          | AI gets everything right for me.
        
       | kibwen wrote:
       | Our exposure to smart-sounding chatbots is inducing a novel form
       | of pareidolia: https://en.wikipedia.org/wiki/Pareidolia .
       | 
       | Our brains are pattern-recognition engines and humans are social
       | animals; together that means that our brains are predisposed to
       | anthropomorphizing and interpreting patterns as human-like.
       | 
       | For the whole of human history thus far, the only things that we
       | have commonly encountered that conversed like humans have been
       | other humans. This means that when we observe something like
       | ChatGPT that appears to "speak", we are susceptible to
       | interpreting intelligence where there is none, in the same way
       | that an optical illusion can fool your brain into perceiving
       | something that is not happening.
       | 
        | That's not to say that humans are somehow special or that human
        | intelligence is impossible to replicate. But these things
       | right here aren't intelligent, y'all. That said, can they be
       | useful? Certainly. Tools don't need to be intelligent to be
       | useful. A chainsaw isn't intelligent, and it can still be highly
       | useful... and highly destructive, if used in the wrong way.
        
         | pixl97 wrote:
         | >we are susceptible to interpreting intelligence where there is
         | none,
         | 
          | I disagree, as this is much too simple a statement. You have
          | had near-daily dealings with less-than-human intelligences for
          | most of your life; we call them animals. We realize they have a
          | wide range of intelligence, from extremely simple behavior to
          | near-human competency.
          | 
          | This is why I dismiss your 'not intelligent yet' statement. The
          | problem here is that we lack precise language for talking about
          | the components of intelligence and the wide range in which it
          | manifests.
        
       | frereubu wrote:
       | For me the fundamental issue at the moment for ChatGPT and others
       | is the tone it replies in. A large proportion of the information
       | in language is in the tone, so someone might say something like
       | "I'm pretty sure that the highest mountain in Africa is Mount
       | Kenya" whereas ChatGPT instead says "the highest mountain in
       | Africa is Mount Kenya", and it's the "is" in the sentence that's
       | the issue. So many issues in language revolve around "is" - the
        | certainty is very problematic. It reminds me of a tutor at art
        | college who said too many people were producing "things that
        | look like art". ChatGPT produces sentences that look like
        | language, and because of "is" they read as quite compelling due
        | to the certainty it conveys. Modify that so it says "I think..."
        | or "I'm
       | pretty sure..." or "I reckon..." and the sentence would be much
       | more honest, but the glamour around it collapses.
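        | 
        | A minimal sketch of nudging it that way (the complete argument
        | is a stand-in for whatever completion API is in use; this is an
        | illustration of the framing, not how Bing or ChatGPT actually
        | work):
        | 
        |     HEDGE = ("Answer the question, but mark factual claims with "
        |              "your honest confidence, e.g. 'I'm pretty sure...' "
        |              "or 'I may be wrong about this', and say 'I don't "
        |              "know' when you don't.")
        | 
        |     def hedged_answer(complete, question):
        |         # complete: any function mapping a prompt string to text
        |         return complete(HEDGE + "\n\nQuestion: " + question)
        | 
        | Of course an instruction like this only changes the tone; it
        | doesn't make the underlying answer any more likely to be true.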
        
         | esotericimpl wrote:
         | [dead]
        
       | Plough_Jogger wrote:
       | I have a feeling we will see a resurgence of some of the ideas
       | around expert systems; current language models inherently cannot
       | provide guarantees of correctness (unless e.g., entire facts are
       | tokenized together, but this limits functionality significantly).
        
       | bambax wrote:
       | > _Bing AI can 't be trusted_
       | 
       | Of course it can't. No LLM can. They're bullshit generators. Some
       | people have been saying it from the start, and now everyone is
       | saying it.
       | 
       | It's a mystery why Microsoft is going full speed ahead with this.
       | A possible explanation is that they do this to annoy / terrify
       | Google.
       | 
       | But the big mystery is, why is Google falling for it? That's
       | inexplicable, and inexcusable.
        
         | Nemo_bis wrote:
         | > It's a mystery why Microsoft is going full speed ahead with
         | this.
         | 
         | Maybe they had some idle GPU capacity in some DC or they needed
         | to cross-subsidize Azure to massage the stock market
         | multipliers, or something.
        
       | coffeeblack wrote:
       | It just goog... ehm bings your question and then summarizes what
       | the resulting web pages say. Works well, but ChatGPT works much
       | better.
        
       | JoshTko wrote:
        | Hot take: ChatGPT rises and then crashes fast once SEO
        | optimization shifts to ChatGPT optimization.
        
       | userbinator wrote:
       | I don't know if it's started to use AI for regular search
       | queries, but I noticed within the past week or two that Bing
        | results got _much_ worse. It seems it doesn't even respect
       | quoting anymore, and the second and subsequent pages of results
       | are almost entirely duplicates of the first. I normally use Bing
       | when Google fails to yield results or decides to hellban me for
       | searching too specifically, and for the past few years it was
       | acceptable or even occasionally better, but now it's much worse.
       | If that's the result of AI, then _do not want!!!_
        
         | joe_the_user wrote:
          | Well, reworking Bing and Google for a ChatGPT interface is
          | going to be a massive hardware and software enterprise. And
          | there are a lot of questions involved, to say the least.
         | 
          | Where will the software engineers come from? We're in a belt-
          | tightening part of the business cycle and FANGs are under
          | pressure not to hire, so you assume the existing engineers. But
          | these engineers are now working on real things, so those real
          | things may suffer. Which brings actual profits, the future AI
          | thing or the present? The future AI is unavoidable given the
          | possibilities are visible and the competition is on, but "shit
          | shows" of various sorts seem very possible.
         | 
         | Where will the hardware and the processing power come from?
         | There are estimates of server power consumption quintupling [1]
         | but these are arbitrary - even if it just doubles, just
         | "plugging the cords" in takes time. And where would the new
         | TPUs/GPUs come from? TSMC has a capacity determined by
         | investments already made and much of that capacity is allotted
         | already - more capacity anywhere would involve massive capital
         | allocation and what level of increased profits will pay for
         | this?
         | 
         | [1] https://www.wired.com/story/the-generative-ai-search-race-
         | ha...
        
         | Eduard wrote:
         | > I normally use Bing when Google fails to yield results...
         | 
          | Every once in a while I hear someone on Hacker News hitting a
          | dead end with Google Search. Can you give an example where
          | Google Search fails, but other search engines (e.g. Bing)
          | provide results? Must be fringe niche topics, no?
         | 
         | >... or decides to hellban me for searching too specifically
         | 
         | Is hellbanning a thing at Google? What happens if one gets
         | hellbanned?
        
       | rvz wrote:
        | There is no point in hyping a 'better search engine' when it
        | continues to hallucinate incorrect and inaccurate results. It is
        | now reduced to an 'intelligent sophist' rather than a search
        | engine. Once many realise that it also frequently hallucinates
        | nonsense, it is essentially no better than Google Bard.
       | 
        | After looking at the limitations of ChatGPT and Bing AI, it is
        | now clear that they aren't reliable enough to even begin to
        | challenge search engines, or even to cite their sources properly.
        | LLMs are just bullshit generators, which is what this current AI
        | hype is all about.
       | 
        | Until all of these AI models are open-sourced and transparent
        | enough to be trustworthy, or a competitor does that instead,
        | there is nothing revolutionary about this AI hype other than an
        | AI SaaS riding a creative Clubhouse-like waitlist mania.
        
       | mnd999 wrote:
       | Of course it can't. That you're even surprised by this enough to
       | write a blog post is more worrying.
        
         | password54321 wrote:
          | In which part of the post did the author convey surprise that
          | it can't be trusted? It just seems like a response to the mass
          | hype currently surrounding AI.
        
           | mnd999 wrote:
           | Nobody writes a blog called '1 + 1 = 2' do they? That would
           | be obvious and dull. It stands to reason the author thought
           | there was something surprising or interesting about it, or
           | why would they bother?
        
       | beebmam wrote:
       | I already don't trust virtually any search results except
       | grep/rg.
        
       | mojo74 wrote:
        | To follow up on the author's example, Bing search doesn't even
        | know when the new Avatar film is actually out (DECEMBER 17
        | 2021?)
       | 
       | https://www.bing.com/search?q=when+is+the+new+avatar+film+ou...
       | 
       | Bing AI doesn't stand a chance.
        
       | cwkoss wrote:
       | I think this is a weird non-issue and it's interesting people are
       | so concerned about it.
       | 
       | - Human curated systems make mistakes.
       | 
       | - Fiction has created the trope of the omniscient AI.
       | 
       | - GPT curated systems also make mistakes.
       | 
       | - People are measuring GPT against the omniscient AI mythology
       | rather than the human systems it could feasibly replace.
       | 
        | - We shouldn't ask "is AI ever wrong?"; we should ask "is AI
        | wrong more often than the human-curated information?" (There are
        | levels of this - min wage truth is less accurate than senior
        | engineer truth.)
       | 
       | - Even if the answer is that AI gets more wrong, surely a system
       | where AI and humans are working together to determine the truth
       | can outperform a system that is only curated by either alone.
       | (for the next decade or so, at least)
        
         | nirvdrum wrote:
         | I think there's an issue with gross misrepresentation. This
         | isn't being sold as a system with 50% accuracy where you need
         | to hold its hand. It's sold as a magical being that can answer
         | all of your questions and we know that's how people will treat
         | it. I think this is a worse situation than data coming from
         | humans since people are skeptical of one another. But, many
          | think AI will be an impartial, omnipotent source of facts, not
          | a bunch of guesses that might be right slightly more often than
          | it's wrong.
        
           | cwkoss wrote:
            | I see your point, but I feel like there's going to be an
            | 'eating tidepods' level societal meme within a year mocking
            | people who fall for AI hallucinations as "boomers", and then
            | the omnipotent AI myth will be shattered.
           | 
           | Essentially, I believe the baseline misinformation level is
           | being undercounted by many and so the delta in the interim
           | while people are learning the fallibility of AI is small
           | enough it is not going to cause significant issues.
           | 
           | Also the 'inoculation' effect of getting the public using
           | LLMs could result in a net social benefit as the common man
            | will be skeptical of authorities appealing to AI to justify
            | actions - which I think could be much more dangerous than
            | Suzie copying hallucinated facts into her book report.
        
             | nirvdrum wrote:
             | If the only negative effect is some people look foolish,
             | that's an acceptable risk. I'm worried a bit it's closer to
             | people thinking that Tesla has a full self-driving system
             | because Tesla called it auto-pilot and demonstrated videos
             | of the car driving without a human occupant. In that case,
             | yeah the experts understand that "auto-pilot" still means
             | driver-assisted, but we can't ignore the fact that most
             | people don't know that and that the marketing info
             | reinforced the wrong ideas.
             | 
             | I don't want to argue with people that won't understand the
             | AI model can be wrong. I'm far more concerned with public
             | policy being driven by made up facts or someone responding
             | poorly in an emergency situation because a search engine
             | synthesized facts. Outside of small discussions here, I
             | don't see any acknowledgment about the current limitations
             | of this technology, only the sunny promises of greener
             | pastures.
        
         | Barrin92 wrote:
         | >we should ask "is AI wrong more often than the human-curated
         | information?
         | 
         | No, this isn't what we should ask, we should ask if the
         | interface that AI provides is conducive to giving humans the
         | ability to detect the mistakes that it makes.
         | 
         | The issue isn't how often you get wrong information, it's _to
         | what extent you're able to spot wrong information_ under
         | normal use cases. And the uniform AI interface that gives you
         | complete bullshit in the technical sense of that term provides
         | no indication regarding the trustworthiness of the information.
         | A source with 20% of wrong info that you don't notice is worse
         | than one with 80% that you identify.
         | 
         | When you use traditional search you get an unambiguous source,
         | context, date, language, authorship and so forth, and you must
         | place what you read in context yourself. You know the onus is
         | on you. ChatGPT is the half self-driving car. It's an inherently
         | pathological interaction because everything in the design
         | screams to take the hands off the wheel. It's an opaque system,
         | and a blackbox with the error rate of a human is a disaster.
         | Human-machine interaction is not human-human interaction.
        
         | 10rm wrote:
         | I agree 100% with your last point, even as someone who is
         | relatively more skeptical of GPT than the average person.
         | 
         | I think a lot of the concern though is coming from the way the
         | average person is reacting to GPT and the way they're using it.
         | The issue isn't that GPT makes mistakes, it's that people (by
         | their own fault, not GPT's necessarily) get a false sense of
         | security from GPT and, since the answers are provided in a
         | concise, well-written format, don't apply the same skepticism
         | they do when searching for something. That's my experience at
         | least.
         | 
         | Maybe people will just get better at using this, the tools will
         | improve, and it won't be as big an issue, but it feels like a
         | trend from Facebook to TikTok of people opting for more easily
         | digestible content even at the risk of disinformation.
        
           | cwkoss wrote:
           | Interesting points.
           | 
           | - I wonder what proportion of people who are getting a false
           | sense of security with GPT also were getting that same false
           | sense from human systems. Will this shift entail a net
           | increase in gullibility, or is this just 'laundering'
           | foolishness?
           | 
           | - I think the average TikTok user generally has much better
           | media literacy than the average Facebook user. But that
           | probably depends a lot on your filter bubble.
        
       | dqpb wrote:
       | > Bing AI did a great job of creating media hype, but their
       | product is no better than Google's Bard
       | 
       | Remind me, how do I access Bard?
        
       | andrewstuart wrote:
       | AI providers really need to set expectations correctly.
       | 
       | They are getting into trouble by allowing people to think the
       | answers will be correct.
       | 
       | They should be stating up front that AI tries to be correct but
       | isn't always and you should verify the results.
        
       | imranq wrote:
       | Unfortunately this overhyped launch has started the LLM arms
       | race. Consumers don't seem to care in general about factuality as
       | long as they can get an authoritative sounding answer that is
       | somewhat accurate...at least for now
        
         | wg0 wrote:
         | Somewhat the opposite - if LLMs continue to perform like that,
         | with made-up information and such, their credibility will erode
         | over time and the de facto expectation will be that they don't
         | work or aren't accurate, which would result in people relying
         | on them less.
         | 
         | Much like how self-driving cars haven't had a mainstream
         | breakthrough yet.
        
         | SketchySeaBeast wrote:
         | This hasn't really been put in front of consumers, has it? This
         | is all very niche - how many even know that there is a Bing AI
         | thing going on? I think it's far too early to make statements
         | about what people think or want.
        
           | Tepix wrote:
           | OpenAI raced past 100 million users, that's hardly niche. All
           | the tech people I've talked to have played around with it.
           | Some use it every day.
        
             | SketchySeaBeast wrote:
             | But is it a product or a toy for the majority of those
             | users?
        
             | at-fates-hands wrote:
             | As someone who does SEO on a regular basis, I thought it
             | would be brilliant to have this write content for you.
             | Google already made updates to its algo to ferret out
             | content that is created by AI and list it as spam.
             | 
             | I figure we're going to see a lot of guard rails being put
             | up as this gains wider usage to try and cut off nefarious
             | uses of it. I know right now, there are people who have
             | already figured out how to bypass the filters and are
             | selling services on the dark web that cater to people who
             | want to use it for malware and other scams:
             | 
             |  _Hackers have found a simple way to bypass those
             | restrictions and are using it to sell illicit services in
             | an underground crime forum, researchers from security firm
             | Check Point Research reported._
             | 
             | https://arstechnica.com/information-
             | technology/2023/02/now-o...
        
           | teraflop wrote:
           | Right now, if you go to bing.com, there's a big "Introducing
           | the new Bing" banner, which takes you to the page about their
           | chatbot. You have to get on a waitlist to actually use it,
           | though.
        
             | SketchySeaBeast wrote:
             | So it's limited to those who use bing and who opt in? Still
             | fairly niche in that case.
        
         | Nathanba wrote:
         | Unfortunately? This is the best kind of arms race, the one
         | where we race towards technology that is ultimately going to
         | help all of humanity.
        
           | SketchySeaBeast wrote:
           | I'm trying to decide if this is a valid arms race or jumping
           | the gun. Kind of feels like if someone came up with auto
           | racing before the invention of the ICE and so they spend a
           | bunch of time pushing race cars around the track only for
           | them all to realize this isn't working and give up on the
           | whole idea.
        
         | arduinomancer wrote:
         | I think it's more like Tesla Autopilot
         | 
         | In the beginning there was lots of hype because you couldn't
         | use it
         | 
         | But now that it's in consumer hands there are tons of videos of it
         | messing up, doing weird stuff
         | 
         | To the point that it's now common knowledge that autopilot is
         | not actually magical AI driving
        
       | eppp wrote:
       | Bing AI gets a pass because it's disruptive. Google doesn't
       | because it is the incumbent. Mystery solved.
        
       | elorant wrote:
       | I frequently use ChatGPT to research various topics. I've noticed
       | that eight out of ten times when I ask it to recommend some
       | books about a topic it recommends non-existent books. There's
       | no way I'd trust a search engine built on it.
        
         | FleurBouquet wrote:
         | [dead]
        
         | BubbleRings wrote:
         | There is really no other way to think of them, in terms of
         | reliability, than lying bastards. I mean, ChatGPT is very fun
         | and quite useful, but think of it. Anybody that has played with
         | it for even an hour has been confidently lied to by it,
         | multiple times. If you keep friends around that treat you like
         | that, you need a better friend picker! (Maybe an AI could
         | help.)
        
           | elorant wrote:
           | ChatGPT has no concept of truth or lie. It's a language model
           | that uses statistical models to predict what to say next.
           | Your assumptions about its intentions reflect only your bias.
        
       | noobermin wrote:
       | Reading this honestly made me afraid, like Bing AI
       | is a tortured soul, semi-conscious, stuck in a box. I'm not sure
       | how I feel about this[0].
       | 
       | [0] https://twitter.com/vladquant/status/1624996869654056960
        
         | gptgpp wrote:
         | Really?
         | 
         | I think that the first example is one of the funniest things
         | I've read today.
         | 
         | The second example, getting caught in a predictive loop, is
         | also pretty funny considering it's supposed to be proving it's
         | conscious (eg. not an LLM, prone to looping like that lol).
         | 
         | The last one, littered with emojis and repeating itself like a
         | deranged ex, is just _chef's kiss._
         | 
         | Thanks for that.
        
         | coffeebeqn wrote:
         | It's just good at acting. I'm sure it can be led to behave in
         | almost any way imaginable given the right prompts
        
       | jerf wrote:
       | I have come to two conclusions about the GPT technologies after
       | some weeks to chew on this:
       | 
       | 1. We are so amazed by its ability to babble in a confident
       | manner that we are asking it to do things that it should not be
       | asked to do. GPT is basically the language portion of your brain.
       | The language portion of your brain does not do logic. It does not
       | do analyses. But if you built something very like it and asked it
       | to try, it might give it a good go.
       | 
       | In its current state, you really shouldn't _rely_ on it for
       | anything. But people will, and as the complement of the Wile E.
       | Coyote effect, I think we're going to see a lot of people not
       | realize they've run off the cliff, crashed into several rocks on
       | the way down, and have burst into flames, until after they do it
       | several dozen times. Only then will they look back to realize
       | what a cockup they've made depending on these GPT-line AIs.
       | 
       | To put it in code assistant terms, I expect people to be
       | increasingly amazed at how well they seem to be coding, until you
       | put the results together at scale and realize that while it
       | kinda, sorta works, it is a new type of never-before-seen crap
       | code that nobody can or will be able to debug short of throwing
       | it away and starting over.
       | 
       | This is not because GPT is broken. It is because what it is is
       | not correctly related to what we are asking it to do.
       | 
       | 2. My second conclusion is that this hype train is going to crash
       | and sour people quite badly on "AI", because of the pervasive
       | belief I have seen even here on HN that this GPT line of AIs _is_
       | AI. Many people believe that this is the beginning and the end of
       | AI, that anything true of interacting with GPT is true of AIs in
       | general, etc.
       | 
       | So people are going to be even more blindsided when someone
       | develops an AI that uses GPT as its language comprehension
       | _component_, but does this higher level stuff that we _actually_
       | want sitting on top of it. Because in my opinion, it's pretty
       | clear that GPT is producing an _amazing_ level of comprehension
       | of what a series of words means. The problem is, that's _all_ it
       | is really doing. This accomplishment should not be understated.
       | It just happens to be the fact that we're basically abusing it in
       | its current form.
       | 
       | What it's going to do as a _part_ of an AI, rather than the whole
       | thing, is going to be amazing. This is certainly one of the hard
       | problems of building a  "real AI" that is, at least to a first
       | approximation, solved. Holy crap, what times we live in.
       | 
       | But we do not have this AI yet, even though we think we do.
        
         | phire wrote:
         | Sentient AIs in science fiction are always portrayed as being
         | more-or-less infallible, at least when referencing their own
         | knowledge banks.
         | 
         | Then ChatGPT comes along and starts producing responses good
         | enough that people feel like it's almost a sentient AI. And they
         | suddenly start expecting it to share the infallibility that
         | fictional AIs have always possessed.
         | 
         | But it's not a sentient AI. It's just a language model. Just a
         | beefed up auto-correct. I'm very impressed just what
         | capabilities a language model gets when you throw this many
         | resources at it (like, it seems to be able to approximate logic
         | and arithmetic to decent accuracy, which is unexpected).
         | 
         | Also... even if it was a sentient AI, why would it be
         | infallible? Humans are sentient, and nobody ever accused us of
         | being infallible.
        
           | spikder wrote:
           | The lack of consistency is a big issue. It may well be able
           | to organize your trip to Mexico, but then it tells me that
           | "the product of two primes must be prime because each factor
           | is prime" ... how will one ever trust it? Moreover, how to
           | use it?
           | 
           | If a Tesla can get you there with 1% human intervention, but
           | that happens to be the 1% that would have killed you had you
           | not intervened ... how do we interface with such systems?
        
           | hackinthebochs wrote:
           | >But it's not a sentient AI. It's just a language model. Just
           | a beefed up auto-correct.
           | 
           | There is a large space between "sentient" and "beefed up
           | autocorrect". Why do people insist on going for the most
           | reductive description they can muster?
        
             | pixl97 wrote:
             | Because the average person you speak to would consider
             | beefed up autocorrect to be near magic as it is. Once you
             | get near to the limits of an individuals comprehension
             | adding more incomprehensible statements/ideas doesn't
             | really change much, their answer is still 'magic'.
        
         | rdedev wrote:
         | Really liked your analogy on GPT being similar to the language
         | center of the brain. Almost all current methods to teach GPT
         | deductive logic have been through an inductive approach: giving
         | it training examples on how to do deduction. The thing is, it
         | might be possible to reach 80% of the way there with more data
         | and parameters, but a wall will be hit sooner or later.
        
         | btown wrote:
         | I love the mental model of GPT as only one part of the brain,
         | but I believe that the integration of other "parts" of the
         | brain will come sooner than you think. See, for instance,
         | https://twitter.com/mathemagic1an/status/1624870248221663232 /
         | https://arxiv.org/abs/2302.04761 where the language model is
         | used to create training data that allows it to emit tokens that
         | function as lookup oracles by interacting with external APIs.
         | And an LLM can itself understand when a document is internally
         | inconsistent, relative to other documents, so it can integrate
         | the results of these oracles if properly trained to do so.
         | We're only at the surface of what's possible here!
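         | 
         | (To make that pattern concrete - this is just my own toy sketch
         | of the general idea, not the actual Toolformer mechanism; the
         | [CALC]/[SEARCH] markers and the lookup_in_index stub are made up
         | for illustration - the surrounding system scans the model's
         | output for tool markers, runs the call, and splices the result
         | back into the text:)
         | 
         |   import re
         | 
         |   # Toy post-processing loop: the model emits [CALC(expr)] or
         |   # [SEARCH(query)] markers; each marker is replaced with the
         |   # tool's answer before the text is handed back.
         |   TOOL_PATTERN = re.compile(r"\[(CALC|SEARCH)\((.*?)\)\]")
         | 
         |   def lookup_in_index(query):
         |       return f"<results for {query!r}>"  # stub for a real API
         | 
         |   def run_tool(name, arg):
         |       if name == "CALC":
         |           return str(eval(arg, {"__builtins__": {}}))  # toy calc
         |       return lookup_in_index(arg)
         | 
         |   def expand_tool_calls(model_output):
         |       return TOOL_PATTERN.sub(
         |           lambda m: run_tool(m.group(1), m.group(2)), model_output)
         | 
         |   # expand_tool_calls("That is [CALC(2*365+1)] days.")
         |   # -> "That is 731 days."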
         | 
         | I also look to the example of self-driving cars - just because
         | Tesla over-promised, that didn't discourage its competitors
         | from moving forward slowly but surely. It's hard to pick a
         | winner right now, though - with so much in big tech culturally
         | up in the air between the layoffs and this sea change in AI
         | viability, it's hard to know who will be first to release
         | something that truly feels rock-solid.
        
           | phire wrote:
           | Yes, this is something that I've been thinking ever since
           | GPT3 came out.
           | 
           | It's insanely impressive what it can do given it's just a
           | language model. But if you start gluing on more components,
           | we could end up with a more or less sentient AGI within a few
           | years.
           | 
           | Bing has already hooked it up to a search engine. That post
           | hooks it up to other tools.
           | 
           | I think what is needed next is a long term memory where it
           | can store dynamic facts and smartly retrieve them later,
           | rather than relying on just the current 4000-token
           | window. It needs to be able to tell when a user is circling
           | back to a topic they talked about months ago and pull out the
           | relevant summaries of that conversation.
           | 
           | I also think it needs a working memory: continually editing
           | the token window to fit the relevant state of the
           | conversation - summarising recent tokens, saving things out
           | to long term storage, pulling new information in from long
           | term storage, web searches and other tools.
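           | 
           | (A minimal sketch of that idea - my own toy illustration, not
           | anything Bing or OpenAI actually ships; embed() below is a fake
           | stand-in for a real sentence-embedding model:)
           | 
           |   import numpy as np
           | 
           |   # Sketch of a long term memory: store summaries as vectors,
           |   # pull the most relevant ones back into the context window.
           |   def embed(text):
           |       rng = np.random.default_rng(abs(hash(text)) % (2**32))
           |       return rng.standard_normal(384)  # fake embedding
           | 
           |   class MemoryStore:
           |       def __init__(self):
           |           self.items = []  # (vector, summary) pairs
           | 
           |       def save(self, summary):
           |           self.items.append((embed(summary), summary))
           | 
           |       def recall(self, query, k=3):
           |           q = embed(query)
           |           ranked = sorted(self.items,
           |                           key=lambda it: -float(np.dot(it[0], q)))
           |           return [s for _, s in ranked[:k]]
           | 
           |   def build_prompt(memory, recent_turns, user_message, budget=4000):
           |       # "Working memory": relevant old summaries plus recent
           |       # turns, trimmed to fit the model's fixed token window.
           |       parts = memory.recall(user_message) + recent_turns + [user_message]
           |       return "\n".join(parts)[: budget * 4]  # crude character budget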
        
             | pixl97 wrote:
             | I think a number of breakthroughs may be needed to keep an AI
             | 'sane' with a large working memory at this point. How do we
             | keep them 'on track' at least in a way that seems somewhat
             | human. Humans that have halting problem issues can either
             | be geniuses (diving into problems and solving them to the
             | point of ignoring their own needs), or clinical (ignoring
             | their needs to look at a spot on the wall).
        
         | SergeAx wrote:
         | > new type of never-before-seen crap code that nobody can or
         | will be able to debug short of throwing it away and starting
         | over
         | 
         | The good thing is that we have been dealing with exactly the
         | same type of code here and there for decades already. Actually,
         | every time I see a commercial codebase that isn't a yarn of
         | spaghetti, I thank the gods for it, because that is not the
         | rule but the exception.
         | 
         | What I really wonder is what it will be like when the next
         | version of the same system is coded from the ground up by the
         | next version of the same ML model.
        
         | epups wrote:
         | I agree completely with the first part of your post. However, I
         | think even performing these language games should definitely be
         | considered AI. In fact, understanding natural language queries
         | was considered for decades a much more difficult problem than
         | mathematical reasoning. Issues aside, it's clear to me we are
         | closer to solving it than we ever have been.
        
           | jerf wrote:
           | Sorry, I didn't mean that LLMs are not a _subset_ of AI. They
           | clearly are. What they are not is _equal to_ AI; there are
           | things that are AI that are not LLMs.
           | 
           | It is obvious when I say it, but my internal language model
           | (heh) can tell a lot of people are not thinking that way when
           | they speak, and the latter is often more reliable than how
           | people _claim_ they are thinking.
        
             | pixl97 wrote:
             | I think the problem here is in a classification of what is
             | ( I ) in the first place. For us to answer the question of
             | what equals AI we must first answer the question of what
             | equals human intelligence in a self-consistent, logical,
             | parsable manner.
        
         | fullshark wrote:
         | Incremental improvements, and it getting to the point of being
         | good enough for a set of tasks but maybe not all tasks, seems
         | far more likely.
        
         | dekervin wrote:
         | I have bookmarked your comment and I hope to have the
         | discipline to come back to it every 3 months or so for the next
         | couple of years. Because I think you are right, but I hadn't
         | noticed it before. When the real thing comes, we will probably
         | be blindsided.
        
         | jamespking wrote:
         | [dead]
        
         | skilled wrote:
         | I think what I would add to your comment, and specifically
         | criticize the HN hype around it, is that all these GPT "AI"
         | tools are entirely dependent on the OpenAI API. ChatGPT might
         | have shown a glimpse of spark by smashing two rocks together,
         | but it is nowhere near being able to create a full-blown fire
         | out of it.
         | 
         | Outside of Google and OpenAI, I doubt there is a single team in
         | the world right now that would be capable of recreating ChatGPT
         | from scratch using their own model.
        
           | dagw wrote:
           | _I doubt there is a single team in the world right now that
           | would be capable of recreating ChatGPT from scratch using
           | their own model._
           | 
           | Why not? Lack of know-how or lack of resources? If, say, Baidu
           | decided to spend a billion dollars on this problem, don't you
           | think they have the skills and resources to quickly catch up?
        
             | pixl97 wrote:
             | It depends on the nature of the problem at hand.
             | 
             | For example, if we had thrown money at a group in 1905, do
             | you think they could have come up with special relativity,
             | or do you believe that it required geniuses working on the
             | problem to have a breakthrough?
        
           | jerf wrote:
           | I would love to know how much of ChatGPT is "special sauce"
           | and how much of it is just resources thrown at the problem at
           | a scale no one else currently wants to compete with.
           | 
           | I am not making any implicit claims here; I really have no
           | idea.
           | 
           | I'm also not counting input selection as "special sauce";
           | while that is certainly labor-intensive, it's not what I
           | mean. I mean more like, are the publicly-available papers on
           | this architecture sufficient, or is there some more math not
           | published being deployed?
        
           | Q6T46nT668w6i3m wrote:
           | Meta?
        
         | iamflimflam1 wrote:
         | We're just seeing the standard hype cycle. We're in the "Peak
         | of Inflated Expectations" right now. And a lot of people are
         | tumbling down into the "Trough of Disillusionment".
         | 
         | Behind all the hype and the froth there are people who are
         | finding uses and benefits - they'll emerge during the "Slope of
         | Enlightenment" phase and then we'll reach the "Plateau of
         | Productivity".
        
         | bitL wrote:
         | "babble in a confident manner"
         | 
         | OK, so we figured out how to automate away management jerks.
         | Isn't that a success?
        
         | 6510 wrote:
         | > I have come to two conclusions about the GPT technologies
         | after some weeks to chew on this:
         | 
         | <sarcasm>Just 2 weeks of training data? Surely the conclusions
         | are not final? No doubt a lot has changed over those 2 weeks?
         | 
         | I think the real joke is still, Q: "what is intelligence?" A:
         | "We don't know, all we know is that you are not a good example
         | of it".
         | 
         | I fear these hilarious distortions are only slightly different
         | from those we mortals make all the time. They stand out because
         | we would get things wrong in different ways.
         | 
         | > 1. We are so amazed by its ability to babble in a confident
         | manner that we are asking it to do things that it should not be
         | asked to do.
         | 
         | God, where have we seen this before? The further up the human
         | hierarchy the more elaborate the insanity. Those with the most
         | power, wealth and even those of us with the greatest intellect
         | manage to talk an impressive amount of bullshit. We all do it
         | up to our finest men.
         | 
         | The only edge we have over the bot is that we know when to keep
         | our thoughts to ourselves when it doesn't help our goal.
         | 
         | To do an idiotic time line of barely related events which no
         | doubt describes me better than it describes the topic:
         | 
         | I read about a guy who contributed much to making TV affordable
         | enough for everyone. He thought it was going to revolutionize
         | learning from home. Finally the audience for lectures given by
         | our top professors could be shared with everyone around the
         | globe!
         | 
         | We got the internet, the information superhighway; everyone
         | was going to get access to the vast amount of knowledge
         | gathered by mankind. It only took a few decades for Google to
         | put all the books on it. Or wait....
         | 
         | And now we got the large language models. Finally someone who
         | can tell us everything we want to know with great confidence.
         | 
         | These 3 were and will be instrumental in peddling bullshit.
         | 
         | Q: Tell me about the war effort!
         | 
         | what I want to hear: "We are winning! Just a few more tanks!"
         | 
         | what I don't want to hear: "We are imploding the world economy!
         | Run to the store and buy everything you can get your hands on.
         | Cash is king! Arm yourself. Buy a nuclear bunker."
         | 
         | Can one tell people that? It doesn't seem in line with the
         | bullshit we are comfortable with?
         | 
         | > GPT is basically the language portion of your brain. The
         | language portion of your brain does not do logic. It does not
         | do analyses. But if you built something very like it and asked
         | it to try, it might give it a good go.
         | 
         | At least it doesn't have sinister motives (we will have to add
         | those later)
         | 
         | > In its current state, you really shouldn't rely on it for
         | anything. But people will, and as the complement of the Wile E.
         | Coyote effect, I think we're going to see a lot of people not
         | realize they've run off the cliff, crashed into several rocks
         | on the way down, and have burst into flames, until after they
         | do it several dozen times. Only then will they look back to
         | realize what a cockup they've made depending on these GPT-line
         | AIs.
         | 
         | It seems to me that we are going to have to take the high horse
         | and claim the low road.
        
         | spikder wrote:
         | To add to your point, current technology does not even suggest
         | whether we will ever have such an AI. I personally doubt it. Some
         | evidence: https://en.wikipedia.org/wiki/Entscheidungsproblem.
         | 
         | This is like trying to derive the laws of motion by having a
         | computer analyze 1 billion clips of leaves fluttering in the
         | wind.
        
         | m3047 wrote:
         | I hearken back to before the dot-bomb, when occasionally people
         | would ask me to work on "web sites" which they'd built with
         | desktop publishing software (e.g. ColdFusion).
         | 
         | They'd hand me the code that somebody would've already hacked
         | on. Oftentimes, it still had the original copyright statements
         | in it. Can't get the toothpaste back in the tube now! Plus it's
         | shitcode. Where is that copy of ColdFusion? Looks of complete
         | dumbfoundment.
         | 
         | Oh gee kids, my mom's calling me for lunch; gotta go!
        
         | boh wrote:
         | I think the ultimate problem with AI is it's overvalued as a
         | technology in general. Is this "amazing level of comprehension"
         | really that necessary given the amount of time/money/effort
         | devoted to it? What's become clear with this technology that's
         | been inaccurately labeled as "AI" is that it doesn't produce
         | economically relevant results. It's a net expense any way you
         | slice it. It's like seeing a magician perform an amazing trick.
         | It's both amazing and entirely irrelevant at the same time. The
         | "potential" of the technology is pure marketing at this point.
        
           | kneebonian wrote:
           | It seems to me it is really good at writing. I would think it
           | could replace the profession of technical writing for the
           | most part, it could help you write emails (bring back Clippy,
           | MS you cowards), it could be used as a frontend to an
           | FAQ/self service help type system.
        
             | boh wrote:
             | Have you read the article? You'd have to have 100% faith in
             | the tech to allow it to side-step an actual person. Unless
             | your site is purely a click-farm, you're still probably
             | hiring someone to check it--so what's the point of having
             | it?
        
           | pixl97 wrote:
           | I mean, I take it if I stuck you back in 1900 you'd say the same
           | about flying. "Look at all this wasted effort for almost
           | nothing". And then pretty quickly the world rapidly changed
           | and in around 50 years we were sending things to space.
           | 
           | Intelligence isn't just one thing; really I would say it's the
           | emergent behavior of a bunch of different layers working
           | together. The LLM being just one layer. As time goes on and
           | we add more layers to it the usefulness of the product will
           | increase. At least from a selfish perspective of a
           | corporation, whoever can create one of these intelligences
           | may have the ability to free themselves of massive amounts of
           | payroll by using the AI to replace people.
           | 
           | The potential of AI should not be thought of any differently
           | than the potential of people. You are not magic, just
           | complicated.
        
             | boh wrote:
             | I don't get the point of comparing apples to make a point
             | about oranges. Flying isn't AI. Nor is "progress" a
             | permanent state. If you want to stay in the flying
             | comparison: in 2000 you could fly from NY to Paris in 3 hours
             | on a Concorde, something no longer possible in 2023. Why?
             | Because economics made it unfeasible to maintain. Silicon
             | Valley has made enough promises using "emergent" behavior
             | and other heuristics to justify poor investments.
             | Unfortunately it's taken out too many withdrawals from its
             | bank of credibility and there's not enough left to cloud
             | their exit schemes in hopes and dreams.
        
         | bsaul wrote:
         | I'm not sure whether the hype train is going to crash, or
         | whether only a few very smart companies, using language
         | models for what they're really good at (i.e. generating non-
         | critical text), will manage to revolutionize one field.
         | 
         | We're at the very beginning of the wave, so everybody is
         | a bit overly enthusiastic, dollars are probably flowing, and
         | ideas are popping up everywhere. Then will come a harsh
         | selection step. The question is what the remains will look
         | like, and how profitable they'll be. Enough to build an
         | industry, or just a niche?
        
         | adverbly wrote:
         | It is like we have unlocked an entirely new category of
         | stereotyping that we never even realized existed.
         | 
         | Intelligence is not a prerequisite to speak fancifully.
         | 
         | Some other examples:
         | 
         | 1. We generally assume that lawyers or CEOs or leaders who give
         | well-spoken and inspirational speeches actually know something
         | about what they're talking about.
         | 
         | 2. Well written nonsense papers can fool industry experts even
         | if the expert is trying to apply rigorous review:
         | https://en.m.wikipedia.org/wiki/Sokal_affair
         | 
         | 3. Acting. Actors can easily portray smart characters by
         | reading the right couple sentences off a script. We have no
         | problem with this as an audience member. But CGI is needed for
         | making your superhero character jump off a building without
         | becoming a pancake.
        
           | pixl97 wrote:
           | >Intelligence is not a prerequisite to speak fancifully.
           | 
           | I think this may be a bastardization of the word
           | intelligence. To speak fancifully in a manner accepted by the
           | audience requires some kind of ordered information processing
           | and understanding of the audience's taste. Typically we'd
           | consider that intelligent, but likely Machiavellian depending
           | on the intent.
           | 
           | The problem with the word intelligence is it is too big of a
           | concept. If you look at any part of our brain, you will not
           | find (human)intelligence itself, instead it emerges from any
           | number of processes occurring at different scales. Until we
           | are able to break down intelligence into these smaller better
           | (but not perfectly) classified pieces we are going to keep
           | running into these same problems over and over again.
        
           | WalterBright wrote:
           | > easily portray smart characters
           | 
           | I don't think it is possible for people to emulate the
           | behavior of superintelligent beings. In every story about
           | them, they appear to not actually be any smarter than us.
           | 
           | There is one exception - Brainwave by Poul Anderson. He had
           | the only credible (to me) take on what super intelligent
           | people might be like.
        
             | jodrellblank wrote:
             | Rupert Sheldrake suggests that consciousness is partly
             | about seeing possibilities for our future, evaluating them,
             | and choosing between them. If we keep making decisions the
             | same way, they turn into unconscious habits.
             | 
             | A hungry creature can eat what it sees or stay hungry.
             | Another has more memory and more awareness of different
             | bark and leaves and dead animals to choose from. Another
             | has a better memory of places with safe food in the past
             | and how to get to them. A tool using human can reason down
             | longer chains like 'threaten an enemy and take their food'
             | or 'setup a trap to kill an animal' or 'dig up root, grind
             | root into mash, boil it, eat the gruel'. In that model, a
             | super intelligence might be able to:
             | 
             | - Extract larger patterns from less information. (Con: more
             | risk of a mistake).
             | 
             | - Connect more patterns or more distant patterns together
             | with less obvious connections. (Con: risk of self-
             | delusion).
             | 
             | - Evaluate longer chains of events more accurately with a
             | larger working memory, more accurate mental models. (Con:
             | needs more brain power, more energy, maybe longer time
             | spent in imagination instead of defending self).
             | 
             | - Recall more precise memories more easily. (Con: cost of
             | extra brain to store information and validate memories).
             | 
             | This would be a good model for [fictional] Dr House, he's
             | memorised more illnesses, he's more attentive to observing
             | small details on patients, and more able to use those to
             | connect to existing patterns, and cut through the search
             | space of 'all possible illnesses' to a probable diagnosis
             | based on less information than the other doctors. They run
             | out of ideas quicker, they know fewer diseases, and they
             | can't evaluate such long chains of reasoning from start to
             | conclusion, and they draw less accurate conclusions. In one
             | episode, House meets a genius physicist/engineer and wants
             | to get his opinion on medical cases, but the physicist
             | declines because he doesn't have the medical training to
             | make any sense of the cases.
             | 
             | It also suggests that extra intelligence might get eaten up
             | by other people - predicting what they will do, while they
             | use their extra intelligence to try to be unpredictable.
             | And it would end up as exciting as a chess final, where
             | both grandmasters sit in silence trying to out-reason their
             | opponent through deeper chains in a larger subset of all
             | possible moves until eventually making a very small move.
             | And from the outside the players all seem the same, but they
             | can reliably beat me and cannot reliably beat each other.
        
             | twic wrote:
             | I remember thinking when I read it that Ted Chiang's
             | 'Understand' did a good job (although I have not re-read it
             | to verify this):
             | 
             | https://web.archive.org/web/20140527121332/http://www.infin
             | i...
        
         | jgtrosh wrote:
         | > the language portion of your brain does not do logic
         | 
         | This seems ... Wrong? I suppose that most of what we generally
         | call high-level logic is largely physically separate from some
         | basic functions of language, but just a blanket statement
         | describing logic and language as two nicely separate functions
         | cannot be a good model of the mind.
         | 
         | I also feel like this goes to the core of the debate: is there
         | any thought going on, or is it _just_ a language model? I'm
         | pretty sure many proponents of AI believe that thought is a
         | form of very advanced language model. Just saying the opposite
         | doesn't help the discussion.
        
         | ccozan wrote:
         | Exactly. It's like a mouth speaking without a brain. We need a
         | secondary "reasoning" AI that can process the GPT output
         | further, adding time/space coordinates as well as basic logic
         | including counting, and _then_ maybe we'll see something I can
         | rely on.
        
           | withinboredom wrote:
           | There's a "really old" book called _On Intelligence_ that
           | suggests modeling AI like the brain. This pattern is almost
           | exactly what he suggests.
        
           | danans wrote:
           | > We need a secondary "reasoning" AI that can process the GPT
           | further
           | 
           | We also need "accountability" and "consequences" for the AI,
           | whatever that means (we'd first have to define what "desire"
           | means for it).
           | 
           | In the example from the article, the Bing GPT completely
           | misrepresented the financial results of a company. A human
           | finance journalist wouldn't misrepresent those results, for
           | fear of losing their reputation and out of desire for fame,
           | money, and acceptance. None of those needs exist for an LLM.
        
             | pixl97 wrote:
             | To note, this is what we call the AI alignment problem.
             | 
             | https://www.youtube.com/watch?v=zkbPdEHEyEI
        
         | evo_9 wrote:
         | Yeah, I read this sentiment all the time and here's what I
         | always say - just don't use it. Leave it to the rest of us if
         | it's so wrong / off / bad.
         | 
         | BTW, have you considered maybe you aren't so good at using it?
         | A friend has had very little luck with it, even said he's been
         | 'arguing with it', which made me laugh. I've noticed that it's
         | not obvious to most people that it's mostly about knowing the
         | domain well enough to ask the right question(s). It's not
         | magic, it won't think for you.
         | 
         | Here's the thing... my experience is the opposite... but maybe
         | I'm asking it the right questions. Maybe it's more about using
         | it to reason through your problem in a dialog, and not just ask
         | it something you can google/duckduckgo. It seems like a LOT of
         | people think it's a replacement for Google/search engines -
         | it's not, it's another tool to be used correctly.
         | 
         | Here are some examples of successful uses for me:
         | 
         | I carefully explained a complex work issue that involves
         | multiple overlapping systems and our need to get off of one of
         | them in the middle of this mess. My team has struggled for 8
         | months to come up with a plan. While in a meeting the other day
         | I got into a conversation with ChatGPT about it, carefully
         | explained all the details and then asked it to create a plan
         | for us to get off the system while keeping everything up /
         | running. It spit out a 2-page, 8-point plan that is nearly 100%
         | correct. I showed it to my team, and we made a few minor
         | changes, and then it was anointed 'the plan' and we're actually
         | moving forward.
         | 
         | THEN last night I got stuck on a funny syntax issue that
         | googling could never find the answer to. I got into a
         | conversation with ChatGPT about it, and after it first gave me
         | the wrong answer, I told it that I needed this solution for
         | the latest dotnet library that follows the 'core' language
         | syntax. It
         | apologized! And then gave me the correct answer...
         | 
         | My hunch is the people that are truly irked by this are too
         | deep / close to the subject and because it doesn't match up
         | with what they've worked on, studied, invested time, mental
         | energy into, well then of course it's hot garbage and 'bad'.
        
           | daveguy wrote:
           | > My hunch is the people that are truly irked by this are too
           | deep / close to the subject and because it doesn't match up
           | with what they've worked on, studied, invested time, mental
           | energy into, well then of course it's hot garbage and 'bad'.
           | 
           | That's quite the straw man you've built. Recognizing the
           | limitations of a technology is not the same as calling it hot
           | garbage.
           | 
           | As a language model it's amazing, but I agree with the GP.
           | It's not intelligent. It's very good at responding to a
           | series of tokens with its own series of tokens. That requires
           | a degree of understanding of short scale context that we
           | haven't had before in language models. It's an amazing
           | breakthrough.
           | 
           | But it's also like attributing the muscle memory of your hand
           | to intelligence. It can solve lots of problems. It can come
           | up with good configurations. It is not, on its own,
           | intelligent.
        
           | pclmulqdq wrote:
           | Just to flip this around for a second, with both of your
           | examples, it sounds like you may have a problem with writer's
           | block or analysis paralysis, and ChatGPT helped you overcome
           | that simply due to the fact that it isn't afraid of what it
           | doesn't know. If that helps you, go for it.
           | 
           | On the other hand, it could also help you to just write a
           | random plan or try a few random things when you get stuck,
           | instead of trying to gaze deeply into the problem for it to
           | reveal its secrets.
        
           | gptgpp wrote:
           | You all say it's solving these amazing complex tasks for you,
           | but then don't provide any details.
           | 
           | Then "naysayers" like the linked article provide a whole
           | document with images and appendixes showing it struggles with
           | basic tasks...
           | 
           | So show us. For the love of god all of us would very much
           | LIKE this technology to be good at things! Whatever
           | techniques you're using to get these fantastical results, why
           | don't you share them?
           | 
           | I can get it to provide snippets of code, CLI, toy functions
           | that work. Beyond that, I am apparently an idiot compared to
           | you AI-whisperers.
           | 
           | Also... Whatever happened to "extraordinary claims require
           | extraordinary proof?"
           | 
           | An AI that takes a complex system which has stumped an entire
           | team for 8 months and condenses it into an actionable plan is
           | a (pardon the language) bat-shit insane claim. Things like
           | this used to require proof to be taken seriously.
        
             | pncnmnp wrote:
             | I can provide an example.
             | 
             | I have found ChatGPT to be a valuable tool for improving
             | the clarity and readability of my writing, particularly in
             | my blogs and emails. You can try this by asking questions
             | such as "Can you improve the grammar of the following
             | paragraphs?". You can also specify the desired tone.
             | 
             | It is impressive at simplifying complex technical language.
             | Take the following sentences from a draft I wrote:
             | 
             | To mitigate these issues, it is recommended to simulate the
             | effect of say n random permutations using n random hash
             | functions (h1, h2, ... hn) that map the row numbers (say 1
             | to k) to bucket numbers of the same range (1 to k) without
             | a lot of collisions. This is possible if k is sufficiently
             | large.
             | 
             | What ChatGPT suggested:
             | 
             | To address these issues, it's recommended to simulate the
             | effect of n random permutations using n random hash
             | functions (h1, h2, ... hn). These hash functions should map
             | the row numbers (from 1 to k) to bucket numbers within the
             | same range (1 to k) with minimal collisions. This is
             | achievable if the range k is large enough.
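             | 
             | (Since that paragraph is fairly dense, here is my own toy
             | sketch of the hashing trick it describes - the MinHash idea -
             | with the prime and constants chosen purely for illustration:)
             | 
             |   import random
             | 
             |   # Approximate n random permutations of rows 1..k with n
             |   # hash functions h(x) = (a*x + b) % p, where p is a prime
             |   # much larger than k.
             |   P = 2_147_483_647  # 2**31 - 1, a Mersenne prime
             | 
             |   def make_hashes(n):
             |       coeffs = [(random.randrange(1, P), random.randrange(0, P))
             |                 for _ in range(n)]
             |       return [lambda x, a=a, b=b: (a * x + b) % P for a, b in coeffs]
             | 
             |   def minhash_signature(row_ids, hashes):
             |       # For each simulated permutation, keep the smallest
             |       # "bucket" any of the set's rows lands in.
             |       return [min(h(r) for r in row_ids) for h in hashes]
             | 
             |   hashes = make_hashes(100)
             |   sig_a = minhash_signature({1, 3, 5, 8}, hashes)
             |   sig_b = minhash_signature({1, 3, 5, 9}, hashes)
             |   # The fraction of matching positions in sig_a and sig_b
             |   # estimates the Jaccard similarity of the two row sets.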
        
               | elliotec wrote:
               | Try Grammarly. It's extremely good at this, and with an
               | incredible UX.
        
               | pncnmnp wrote:
               | Yes, I've been using Grammarly for several years now. I
               | still use it in conjunction with ChatGPT. It's efficient
               | in correcting spelling and basic grammar errors. However,
               | more advanced features are only available to premium
               | users. At present, their $12/m fee is a bit steep for me.
        
             | oezi wrote:
             | My take: Because GPT is just stochastically stringing words
             | after each other, it is remarkably good at producing text
             | on par with other text available on the internet. So it can
             | produce plans, strategies, itineraries and so on. The more
             | abstract the better. The 8 point plan is likely great.
             | 
             | It will much more likely fail on anything which involves
             | precision/computation/logic. That's why it can come up with
             | a generic strategy but fail to repeat unadjusted GAAP
             | earnings.
        
               | gptgpp wrote:
               | I agree it's pretty good at generalities and doesn't shit
               | the bed quite so much there. Yet to suggest a plan that an
               | entire team of professionals, who have been working for 8
               | months, could not figure out?
               | 
               | It's certainly not that good, absent some amazing
               | wizardry or some very silly professionals in a very
               | squishy field. Yet I have no explanation for why someone
               | would go on the internet and lie about something like
               | that.
               | 
               | There were comments a while back (less so now) of people
               | making other claims like it was solving complex functions
               | for them and writing sophisticated software.
               | 
               | The entire thing baffles me. If I could get it to do
               | that, I'd be showing you all of my marvelous works and
               | bragging quite a bit as your newfound AI-whisperer. Hell,
               | I'd get it to write a script for me to run that
               | evangelized itself (edit: and me of course, as its chosen
               | envoy to mankind) to the furthest corners of the
               | internet!
        
               | harpiaharpyja wrote:
               | There was an article not too long ago, that I'm
               | struggling to find, that did a great job of explaining
               | why language models are much, much better suited to
               | reverse-engineering code than they are to forward-
               | engineering it.
        
               | [deleted]
        
           | 10rm wrote:
           | When did they say it's garbage? They gave their opinions on
           | its shortcomings and praised some of the things it excels at.
           | You're calling the critics too emotional but this reply is
           | incredibly defensive.
           | 
           | Your anecdotes are really cool and a great example of what
           | GPT can do really well. But as a technical person, you're
           | much more aware of its limitations and what is and isn't a
           | good prompt for it. But as it is more and more marketed to
           | the public, and with people already clamoring to replace
           | traditional search engines with it, relying on the user to
           | filter out disinformation well and not use it for prompts it
           | struggles with isn't good enough.
        
           | whimsicalism wrote:
           | I feel similarly reading many critiques, but honestly the GP
           | is one of the more measured ones that I've read - not sure
           | that your comment is actually all that responsive or
           | proportionate.
        
           | timmytokyo wrote:
           | I don't understand this take. These LLM-based AIs provide
           | demonstrably incorrect answers to questions, they're being
           | mass-marketed to the entire population, and the correct
           | response to this state of affairs is "Don't use it if you
           | don't know how"? As if that's going to stop millions of
           | people from using it to unknowingly generate and propagate
           | misinformation.
        
             | roywiggins wrote:
             | Isn't that what people said about Google Search 20 years
             | ago - that people wouldn't know how to use it, that they
             | would find junk information, etc.? And they weren't entirely
             | wrong, but it doesn't mean that web search isn't useful.
        
               | seunosewa wrote:
               | Can you share any source for the claim about what people
               | said about Google Search?
        
               | allturtles wrote:
               | No, I don't recall anyone saying that. They mostly said
               | "this is amazingly effective at finding relevant
               | information compared to all other search engines." Google
               | didn't invent the Web, so accusing it of being
               | responsible for non-factual Web content would have been a
               | strange thing to do. Bing/Chat-GPT, on the other hand, is
               | manufacturing novel non-factual content.
        
               | 10rm wrote:
               | That's a good point. I don't think anyone is denying that
               | GPT will be useful though. I'm more worried that because
               | of commercial reasons and public laziness / ignorance,
               | it's going to get shoehorned into use cases it's not
               | meant for and create a lot of misinformation. So a
               | similar problem to search, but amplified
        
               | bun_at_work wrote:
               | There are some real concerns for a technology like
               | ChatGPT or Bing's version or whatever AI. However, a lot
               | of the criticisms are about the inaccuracy of the model's
               | results. Saying "ChatGPT got this simple math wrong"
               | isn't as useful or meaningful of a criticism when the
               | product isn't being marketed as a calculator or some
               | oracle of truth. It's being marketed as an LLM that you
               | can chat with.
               | 
               | If the majority of criticism was about how it could be
               | abused to spread misinformation or enable manipulation of
               | people at scale, or similar, the pushback on criticism
               | would be less.
               | 
               | It's nonsensical to say that ChatGPT doesn't have value
               | because it gets things wrong. What makes much more sense
               | is to say that it could be leveraged to harm people,
               | or manipulate them in ways they cannot prevent.
               | Personally, it's more concerning that MS can embed high-
               | value ad spots in responses through this integration,
               | while farming very high-value data from the users, wrt
               | advertising and digital surveillance.
        
           | youk wrote:
           | Great write up. My experience is spot on with your examples.
           | 
           | > I've noticed that it's not obvious to most people that it's
           | mostly about knowing the domain well enough to ask the right
           | question(s). It's not magic, it won't think for you.
           | 
           | Absolutely right with the part of knowing the domain.
           | 
           | I do not entertain or care about the AI fantasies because
           | ChatGPT is extremely good at getting me other information. It
           | saves me from opening a new tab, formulating my query and
           | then hunting for the information. I can spend that saved time
           | on whatever latest / relevant information I should grab from
           | Google.
           | 
           | Google is still in my back pocket for the last mile
           | verification and judgement. I am also skeptical of the
           | information ChatGPT throws out (such as old links). Other
           | than that, ChatGPT to me is as radical as putting the url and
           | search bar into one input. I just move faster with the
           | information.
        
           | joenot443 wrote:
           | I'd really love to hear more about your workplace use-case,
           | what kind of systems are we talking about here?
           | 
           | This is a way of using ChatGPT I haven't really seen before;
           | I'm really into it.
        
           | shrimpx wrote:
           | "Just don't use it" is not salient advice for non-technical
           | people who don't know how it works, and are misled by
           | basically dishonest advertising and product packaging. But
           | hopefully the market will speak, users at large will become
           | educated about its limits via publicized blunders, and these
           | products will be correctly delimited as "lies a lot but could
           | be useful if you are able/willing to verify what it says."
        
             | hughc wrote:
             | I think the original sentence was written more in a "Your
             | loss is my gain" competitive-advantage vein. The real trick
             | is, as you say, to critically assess the output, and many
             | people are incapable of that.
        
           | acchow wrote:
           | I imagine your first example includes private industry
           | information that you are not allowed to divulge.
           | 
           | But your latter example about syntax... mind sharing that
           | ChatGPT conversation?
        
           | CPLX wrote:
           | I mean sure.
           | 
           | In other news I asked it to make a list of all the dates in
           | 2023 that were neither weekends nor US federal holidays and
           | it left Christmas Day on the list.
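           | 
           | (For anyone who wants to sanity-check that by hand, here is a
           | minimal sketch in Python. The holiday set is abbreviated for
           | illustration rather than the full federal calendar; the point
           | is that 2023-12-25 fell on a Monday and must be excluded:)
           | 
           |     import datetime
           | 
           |     # Abbreviated set of observed 2023 US federal holidays.
           |     FEDERAL_HOLIDAYS_2023 = {
           |         datetime.date(2023, 1, 2),    # New Year's Day (observed)
           |         datetime.date(2023, 7, 4),    # Independence Day
           |         datetime.date(2023, 11, 23),  # Thanksgiving
           |         datetime.date(2023, 12, 25),  # Christmas Day
           |     }
           | 
           |     # Keep Monday-Friday dates in 2023 that are not holidays.
           |     workdays = [
           |         d
           |         for d in (datetime.date(2023, 1, 1) + datetime.timedelta(days=i)
           |                   for i in range(365))
           |         if d.weekday() < 5 and d not in FEDERAL_HOLIDAYS_2023
           |     ]
           |     assert datetime.date(2023, 12, 25) not in workdays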
        
             | mashygpig wrote:
             | Yea, I think people hide "the magic smoke" by using complex
             | queries and then filling in the gaps of chatGPT's outputs
             | with their own knowledge, which then makes them overvalue
             | the output. Strip that away to simple examples like this
             | and it becomes more clear what's going on. (I think there
             | IS a lot of value for them in their current state because
             | they can jog your brain like this; just don't expect it to
             | know how to do everything for you. Think of it as the most
             | sophisticated rubber duck that we've made yet).
        
           | elorant wrote:
           | I too have a very positive experience. I ask specific
           | questions about algorithms and how technical projects work
           | and I enjoy its answers. They won't replace my need to visit
           | a real search engine, nor do I take them at face value. But
           | as a starting point for any research I think it's an amazing
           | tool. It's also quite good for marketing stuff, like writing
           | e-mails, cover letters, copy for your website, summarizing or
           | classifying text, and all language related stuff.
           | 
           | People think it's Cortana from Halo and ask existential
           | questions or they're trying to get it to express feelings.
           | 
           | I think the "AI" framing in its presentation created too many
           | expectations of what it can do.
        
           | wpietri wrote:
           | > Yeah, I read this sentiment all the time and here's what I
           | always say - just don't use it. Leave it to the rest of us if
           | it's so wrong / off / bad.
           | 
           | If it were only a matter of private, individual usage, I'd be
           | fine with it. If that's all you're asking for, we can call it
           | a deal. But it isn't, is it?
        
           | Taurenking wrote:
           | [dead]
        
           | bambax wrote:
           | > _It seems like a LOT of people think it's a replacement
           | for Google/search engines_
           | 
           | Well, that "lot" includes the highest levels of management
           | from Microsoft and Google, so maybe the CAPS are justified.
           | And the errors we're talking about here are errors produced
           | by said management during demos of their own respective
           | product. You would think they know how to use it "correctly".
        
             | whimsicalism wrote:
             | I'm going to let you in on a secret: managers, even high-
             | level ones, can be wrong - and indeed they frequently are.
        
               | bambax wrote:
               | Thanks for that unique insight.
               | 
               | But the question is, are they wrong in that they don't
               | know how to use / promote an otherwise good product, or
               | are they wrong because they are choosing to put forward
               | something that is completely ill-suited for the task?
        
           | westoncb wrote:
           | > Maybe it's more about using it to reason through your
           | problem in a dialog, and not just ask it something you can
           | google/duckduckgo.
           | 
           | Your experience with it sounds very similar to my own. It
           | exhibits something like on-demand precision; it's not a
           | system with some fundamental limit to clarity (like Ted
           | Chiang via his jpeg analogy, and others, have argued): it may
           | say something fuzzy and approximate (or straight up wrong) to
           | begin with but--assuming you haven't run into some corner
           | where its knowledge just bottoms out--you can generally just
           | tell it that it made a mistake or ask for it to
           | elaborate/clarify etc., and it'll "zoom in" further and
           | resolve fuzziness/incorrect approximation.
           | 
           | There is a certain very powerful type of intelligence within
           | it as well, but you've got to know what it's good at to use
           | it well: from what I can tell it basically comes down to it
           | being _very_ good at identifying  "structural similarity"
           | between concepts (essentially the part of cognition which is
           | rooted in analogy-making), allowing it to very effectively
           | make connections between disparate subject matter. This is
           | how it's able to effectively produce original work (though
           | typically it will be directed there by a human): one of my
           | favorite examples of this was someone asking it to write a
           | Lisp program that implements "virtue ethics"
           | (https://twitter.com/zetalyrae/status/1599167510099599360).
           | 
           | I've done a few experiments myself using it to formalize
           | bizarre concepts from other domains and its ability to
           | "reason" in both domains to make decisions about how to
           | formalize, and then generating formalizations, is very
           | impressive. It's not enough for me to say it is unqualifiedly
           | "intelligent", but imo its ability to do this kind of
           | thing makes it clear why calling it a search engine, or
           | something merely producing interpolated averages (a la
           | Chiang), is so misleading.
        
           | YurgenJurgensen wrote:
           | Don't like chlorofluorocarbons or tetraethyllead? Just don't
           | use them.
        
         | wpietri wrote:
         | > We are so amazed by its ability to babble in a confident
         | manner
         | 
         | Sure, we shouldn't use AI for anything important. But can we
         | try running ChatGPT for George Santos's seat in 2024?
        
         | Madmallard wrote:
         | (1) is just simply wrong.
         | 
         | People with domain expertise in software are going to be
         | amplified 10x using ChatGPT and curating the results. Likewise
         | with any field that ChatGPT has adequate training data in.
         | Further models will be created that are more specialized to
         | specific fields, so that their prediction models spew out
         | things that are much more sophisticated and useful.
        
           | mrtranscendence wrote:
           | What, precisely, about (1) is "simply wrong"? You've made a
           | prediction about the usefulness of ChatGPT, but you haven't
           | described why it's wrong to analogize GPT-type models to the
           | language center of a brain.
        
             | Madmallard wrote:
             | "To put it in code assistant terms, I expect people to be
             | increasingly amazed at how well they seem to be coding,
             | until you put the results together at scale and realize
             | that while it kinda, sorta works, it is a new type of
             | never-before-seen crap code that nobody can or will be able
             | to debug short of throwing it away and starting over."
             | 
             | This part
        
           | vidarh wrote:
           | I think you're right. I noted on another thread that I got
           | ChatGPT to produce a mostly-right DNS server in ~10 minutes,
           | and it took me just a couple of corrections to make it work.
           | 
           | It worked great for that task, because I've written a DNS
           | server before (a simple one) and I've read the RFCs, so it
           | was easy for me to find the few small bugs without resorting
           | to a line by line cross-check with a spec that might have
           | been unfamiliar to others.
           | 
           | I expect using it to spit out boilerplate for things you
           | could do just as well yourself will be a lot more helpful
           | than using it to try to avoid researching new stuff (though
           | you might well be able to use it to help summarise and
           | provide restatements of difficult bits to speed up your
           | research/learning as well).
        
             | soiler wrote:
             | In what way is this development loop:
             | 
             | 1. Read technology background thoroughly
             | 
             | 2. Read technology documentation thoroughly
             | 
             | 3. Practice building technology
             | 
             | 4. Ask ChatGPT to create boilerplate for basic
             | implementation
             | 
             | 5. Analyze boilerplate for defects
             | 
             | 10x faster than this development loop:
             | 
             | 1. Read technology background thoroughly
             | 
             | 2. Read technology documentation thoroughly
             | 
             | 3. Practice building technology
             | 
             | 4. Manually create boilerplate for basic implementation
             | 
             | 5. Analyze boilerplate for defects
        
               | Madmallard wrote:
               | For new technologies coming out it won't be effective
               | until newer models are made.
               | 
               | Notice how I said it's going to make developers with
               | existing domain knowledge faster.
               | 
               | But even to your point, I've never used Excel VBA before
               | and I had ChatGPT generate some VBA macros to move data
               | with specific headers and labels from one sheet to
               | another and it wrote a script to do exactly that for me
               | in ~1 minute, and just reading what it wrote immediately
               | helps me clearly understand how it works.
               | The scripts also work.
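               | 
               | (A rough sketch of that kind of macro, in Python with
               | openpyxl rather than VBA, purely for illustration - the
               | filename, sheet names, and header names here are made up:)
               | 
               |     from openpyxl import load_workbook
               | 
               |     WANTED_HEADERS = {"Name", "Amount"}  # made-up headers
               | 
               |     wb = load_workbook("data.xlsx")      # made-up filename
               |     src, dst = wb["Sheet1"], wb["Sheet2"]
               | 
               |     # Row 1 of the source sheet holds the headers; find the
               |     # columns we want to carry over.
               |     headers = [c.value for c in src[1]]
               |     keep = [i for i, h in enumerate(headers, 1)
               |             if h in WANTED_HEADERS]
               | 
               |     # Copy the kept columns, row by row, into the destination.
               |     for r, row in enumerate(src.iter_rows(min_row=1), 1):
               |         for out_col, src_col in enumerate(keep, 1):
               |             dst.cell(row=r, column=out_col,
               |                      value=row[src_col - 1].value)
               | 
               |     wb.save("data.xlsx")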
               | 
               | The fundamental background in computer science and server
               | infrastructure technology is what matters. Then the
               | implementations will be quickly understandable by those
               | who use them.
               | 
               | I asked it to make a 2D fighting game in Phaser 3 and
               | specified what animations it will be using, the controls
               | each player will have, the fact that there's a background
               | with X name, what each of the moves do to the momentum of
               | each player, and the type of collisions it will do and it
               | spat out something in ~15 minutes (mainly because of all
               | the 'continue' commands I had to give) that gets all the
               | major bullet points right and I just have to tweak it a
               | bit to make it functional. The moves are simplified of
               | course but uhh yeah. This is kinda insane. I think you
               | can be hyper specific about even complex technology and
               | as long as there has been good history of it online in
               | github and stack overflow and documentation it will give
               | you something useful quickly.
               | 
               | https://www.youtube.com/watch?v=pspsSn_nGzo Here's a
               | perspective from a guy that used to work at Microsoft on
               | every version of windows from the beginning to XP.
        
               | vidarh wrote:
               | It isn't. My exact _point_ was that it _isn't_ and
               | accordingly ChatGPT produces the best benefits for
               | someone who has _already_ done 1, 2, 3 for a given
               | subject.
               | 
               | It was in agreement with the comment above suggesting that
               | people _with domain expertise_ will be faster with it.
               | 
               | In those cases, ChatGPT will do 4 far faster, and 5 will
               | be little different.
        
             | tsimionescu wrote:
             | How often has the solution to a business problem you faced
             | been "write a simple DNS server"? Or are you claiming that
             | it produced a fully featured and world-scale fast DNS
             | server?
        
               | vidarh wrote:
               | Several times. If that was the only thing I got it to do
               | it wouldn't be very interesting, but that it answered the
               | first problem I threw at it and several subsequent
               | expansions with quite decent code was.
               | 
               | Writing a "world-scale fast DNS server" is a near trivial
               | problem if what you look up _in_ is fast to query. Most
               | people don't know that, because most people don't know
               | how simple the protocol is. As such it's surprisingly
               | versatile. E.g. want to write a custom service-discovery
               | mechanism? Providing a DNS frontend is easy.
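               | 
               | (To give a flavour of how small the happy path is, here is
               | a toy sketch in Python - not the ChatGPT output - that
               | answers every A query with a fixed address, the kind of
               | stub a service-discovery frontend would replace with a
               | real lookup:)
               | 
               |     import socket
               |     import struct
               | 
               |     ANSWER_IP = "127.0.0.1"  # placeholder address
               | 
               |     def build_response(query: bytes) -> bytes:
               |         txid = query[:2]                           # echo the ID
               |         flags = struct.pack(">H", 0x8180)          # std response
               |         counts = struct.pack(">HHHH", 1, 1, 0, 0)  # 1 q, 1 answer
               | 
               |         # The question starts at byte 12: labels, a zero byte,
               |         # QTYPE, QCLASS. Copy it back verbatim.
               |         i = 12
               |         while query[i] != 0:
               |             i += 1 + query[i]
               |         question = query[12:i + 5]
               | 
               |         # One A record; 0xC00C points back at the question name.
               |         answer = (struct.pack(">HHHIH", 0xC00C, 1, 1, 60, 4)
               |                   + socket.inet_aton(ANSWER_IP))
               |         return txid + flags + counts + question + answer
               | 
               |     sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
               |     sock.bind(("127.0.0.1", 5353))
               |     while True:
               |         data, addr = sock.recvfrom(512)
               |         sock.sendto(build_response(data), addr)
               | 
               | Query it with "dig @127.0.0.1 -p 5353 example.com" and it
               | answers. Everything hard lives in what you look up, not in
               | the protocol plumbing.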
               | 
               | How that domain knowledge interacts with ChatGPT's
               | "mostly right" output was the point of my comment, not
               | specifically a DNS server. If you need to implement
               | something you know well enough, odds are ChatGPT can
               | produce a reasonable outline of it that is fast for
               | someone who already knows the domain well enough to know
               | what is wrong with it, and what needs to be refined.
               | 
               | E.g. for fun I asked it right now to produce a web server
               | that supports the Ruby "Rack" interface that pretty much
               | all Ruby frameworks support. It output one that pretty
               | much would work, but had plenty of flaws that are obvious
               | to anyone versed in the HTTP spec (biggest ones: what it
               | output was single threaded, and the HTTP parser is too
               | lax). As a starting point for someone unaware of the spec
               | it'd be awful, because they wouldn't know what to look
               | for. As a starting point for someone who has read the
               | spec, it's easy enough to ask for refinements ("split the
               | request parsing from the previous answer into a separate
               | method"; "make the previous answer multi-threaded" - I
               | tried them; fascinatingly, when I asked it to make it
               | multi-threaded it spit out a better request parsing
               | function, likely because it then started looking more
               | like Rack integrations it's "seen" during training; it
               | ran on the first try, btw. and served up requests just
               | fine).
               | 
               | EDIT: Took just "Make it work with Sinatra" followed by
               | fixing a tiny issue by asking to "Add support for
               | rack.input" to get to a version that could actually serve
               | up a basic Sinatra app.
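               | 
               | (On the single-threaded point: Python's WSGI is a rough
               | analogue of Rack, and a sketch of the "make it
               | multi-threaded" refinement looks something like this - for
               | illustration only, not the ChatGPT output:)
               | 
               |     from socketserver import ThreadingMixIn
               |     from wsgiref.simple_server import WSGIServer, make_server
               | 
               |     class ThreadingWSGIServer(ThreadingMixIn, WSGIServer):
               |         # Handle each request in its own thread, not serially.
               |         daemon_threads = True
               | 
               |     def app(environ, start_response):
               |         # Minimal WSGI app, standing in for a framework.
               |         start_response("200 OK", [("Content-Type", "text/plain")])
               |         return [b"hello\n"]
               | 
               |     with make_server("", 8000, app,
               |                      server_class=ThreadingWSGIServer) as httpd:
               |         httpd.serve_forever()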
        
           | tsimionescu wrote:
           | Expertise in software is about understanding the problem
           | domain, understanding the constraints imposed by the
           | hardware, understanding how to translate business logic to
           | code. None of these are significantly helped by AI code
           | assistance, as they currently exist. The AI only helps with
           | the coding part, usually helping generate boilerplate
           | tailored to your code. That may help 1.1x your productivity,
           | but nowhere near 10x.
        
             | broast wrote:
             | I'm surprised you haven't been able to leverage the AI for
             | the analysis of a problem domain and constraints in order
             | to engineer a novel solution. This is generally what I use
             | it for, and not actual code generation.
        
           | mariusor wrote:
           | Domain knowledge resolves into intuition about solving
           | particular types of problems. All ChatGPT can do about that
           | is offer best guess approximations of what is already out
           | there in the training corpus. I doubt very much that this
           | exercise is anything but wasted time, so I think that people
           | with domain knowledge (in a non trivial domain) are using
           | ChatGPT instead of applying that knowledge, they are
           | basically wasting time 10x not being more productive.
        
           | jerf wrote:
           | I expect ChatGPT to be in a sort of equivalent of the uncanny
           | valley, where any professional who gets to the point that
           | they _can_ routinely use it will also be in a constant war
           | with their _own_ brain to remind it that the output must be
           | carefully checked. In some ways, the 99.99% reliable process
           | used at scale is more dangerous than the 50% reliable
           | process; everyone can see the latter needs help. It's the
           | former where it's so very, very tempting to just let it go.
           | 
           | I'm not saying ChatGPT is 99.99% reliable, just using some
           | numbers for concreteness.
           | 
           | If you were setting out to design an AI that would slip the
           | maximum amount of error into exactly the places human brains
           | don't want to look, it would look like ChatGPT. You can see
           | this in the way that as far as I know, literally _all_ the
           | ads for GPT-like search technologies included significant
           | errors in their _ad copy_ , which you would _think_ everyone
           | involved would have every motivation to error check. This is
           | not merely a  "ha ha, silly humans" story... this _means
           | something_. In a weird sort of way it is a testament to the
           | technology... no sarcasm! But it makes it _dangerous_ for
           | human brains.
           | 
           | Human brains are machines for not spending energy on
           | cognitive tasks. They are very good at it, in all senses of
           | the phrase. We get very good bang-for-the-buck with our
           | shortcuts in the real world. But GPT techs are going to make
           | it really, really easy to not spend the energy to check after
           | a little while.
           | 
           | This is a known problem with human brains. How many people
           | can tell the story of what may be the closest human
           | equivalent, where they got some intern, paid a ton of
           | attention to them for the first two weeks, got to the point
           | where they flipped the "OK they're good now" bit on them, and
           | then came back to a complete and utter clusterfuck at the end
           | of their internship because the supervisor got "too lazy"
           | (although there's more judgment in that phrase than I like,
           | this is a brain thing you couldn't survive without, not just
           | "laziness") to check everything closely enough? They may even
           | have been glancing at the PRs the whole time and only put
           | together how bad the mess is at the end.
           | 
           | I'm not going to invite a technology like this into my life.
           | The next generation, we'll see when it gets here. But GPT is
           | very scary because it's in the AI uncanny valley... very good,
           | very good at hiding the problems from human brains, and not
           | quite good enough to actually do the job.
           | 
           | And you know, since we're not talking theory here, we'll be
           | running this experiment in the real world. You use ChatGPT to
           | build your code, and I won't. You and I personally of course
           | won't be comparing notes, but as a group, we sure will be. I
           | absolutely agree there will be a point where ChatGPT _seems_
           | to be pulling ahead in the productivity curve in the short
           | term, but I predict that won't hold and it will turn net
           | negative at some point. But I don't _know_ right now, any
           | more than you do. We can but put our metaphorical money down
           | and see how the metaphorical chips fall.
        
             | yunwal wrote:
             | The question I have is whether the tools to moderate
             | ChatGPT and correct its wrong answers should be in place
             | for humans anyway. It's not like human workers are 100%
             | reliable processes, and in some cases we scale human work
             | to dangerous levels.
             | 
             | Ultimately, the best way to make sure an answer is correct
             | is to come to it from multiple directions. If we use GPT
             | and other AI models as another direction it seems like a
             | strict win to me.
        
               | pixl97 wrote:
               | Robert Miles recently did a video on this and even that
               | may not be enough. This appears to be a really hard
               | problem.
               | 
               | https://www.youtube.com/watch?v=w65p_IIp6JY
        
         | bccdee wrote:
         | > So people are going to be even more blindsided when someone
         | develops an AI that uses GPT as its language comprehension
         | component
         | 
         | I don't think that would work, because GPT doesn't actually
         | comprehend anything. Comprehension requires deriving meaning,
         | and GPT doesn't engage with meaning at all. It predicts which
         | word is most likely to come next in a sequence, but that's it.
         | 
         | What I think we'd be more likely to end up with is something
         | GPT-esque which, instead of simply generating text, transforms
         | English to and from a symbolic logic language. This logic
         | language would be able to encode actual knowledge and ideas,
         | and it would be used by a separate, problem-solving AI which is
         | capable of true logic and analysis--a true general AI.
         | 
         | The real question, IMO, is if we're even capable of producing
         | enough training data to take such a problem-solving AI to a
         | serious level of intelligence. Scenarios that require genuine
         | intelligence to solve likely require genuine intelligence to
         | create, and we'd need a _lot_ of them.
        
           | jerf wrote:
           | I think if you could somehow examine the output of your
           | language model in isolation, you would find it also doesn't
           | "comprehend". Comprehension is what we assign to our higher
           | level cognitive models. It is difficult to introspectively
           | isolate your own language center, though.
           | 
           | I took a stab at an exercise that may allow you to witness
           | this within your own mind here:
           | https://www.jerf.org/iri/post/2023/streampocalypse-and-
           | first... Don't know if it works for anyone but me, of course,
           | but it's at least an attempt at it.
        
             | jprete wrote:
             | I think you put the wrong link? https://www.jerf.org/iri/po
             | st/2023/understanding_gpt_better/ maybe?
        
               | jerf wrote:
               | Yes, you are correct. Oops. Too late to correct.
        
           | hackinthebochs wrote:
           | >Comprehension requires deriving meaning, and GPT doesn't
           | engage with meaning at all. It predicts which word is most
           | likely to come next in a sequence, but that's it.
           | 
           | Why think that "engaging with meaning" is not in the
           | solution-space of predicting the next token? What concept of
           | meaning are you using?
        
             | youssefabdelm wrote:
             | I get what you mean here but they probably mean referential
             | meaning... having never seen a dog, GPT doesn't really know
             | what a dog is on a physical level, just how that word
             | relates to other words.
        
               | fassssst wrote:
               | How do blind people know what a dog is?
        
               | youssefabdelm wrote:
               | Probably by hearing, touch, etc. - my point is some
               | stimulus from reality, doesn't have to be any of our
               | senses, just some stimulus.
               | 
               | Language is just symbols that stand for a stimulus (in
               | the best case)
        
         | lr4444lr wrote:
         | This is an excellent perspective.
        
         | insane_dreamer wrote:
         | > it's pretty clear that GPT is producing an amazing level of
         | comprehension of what a series of words means. The problem is,
         | that's all it is really doing.
         | 
         | very key point
        
         | jodrellblank wrote:
         | > " _We are so amazed by its ability to babble in a confident
         | manner_ "
         | 
         | But we do this with people - religious leaders, political
         | leaders, 'thought' leaders, venture capitalists, story tellers,
         | celebrities, and more - we're enchanted by smooth talkers, we
         | have words and names for them - silver tongued, they have the
         | gift of the gab, slick talker, conman, etc. When a marketing
         | manager sells a CEO on cloud services, and neither of them know
         | what cloud services are, you can argue that it should matter
         | but it doesn't actually seem to matter. When a bloke on a
         | soapbox has a crowd wrapped around their finger, everyone goes
         | home after and the most common result is that the feeling fades
         | and nothing changes. When two people go for lunch and one asks
         | 'what's a chicken fajita?' and the other says 'a Spanish potato
         | omelette' and they both have a bacon sandwich and neither of
         | them check a dictionary, it doesn't _matter_.
         | 
         | Does it matter if Bing Chat reports Lululemon's earnings
         | wrongly? Does it matter if Google results are full of SEO spam?
         | It "should" matter but it doesn't seem to. Who is interested
         | enough in finances to understand the difference between "the
         | unadjusted gross margin" and "The gross margin adjusted for
         | impairment charges", for whom the difference actually matters,
         | who is relying exclusively on Bing Chat to find that out, and
         | who can't spot the mistake?
         | 
         | I suspect that your fears won't play out because most of us go
         | through our lives with piles of wrong understanding which
         | doesn't matter in the slightest - at most it affects a trivia
         | quiz result at the pub. People with life threatening allergies
         | take more care than 'what their coworker thinks is probably
         | safe'. We're going to have ChatGPT churn out plausible sounding
         | marketing material which people don't read. If people do read
         | it and call, the call center will say "sorry that's not right,
         | yes we had a problem with our computer systems" and that
         | happens all the time already. Some people will be
         | inconvenienced, some businesses will suffer some lost income,
         | society is resilient and will overall route around damage, it
         | won't be the collapse of civilisation.
        
           | guhidalg wrote:
           | I'm waiting for the legal case that decides if AI generated
           | content is considered protected speech or not.
        
           | pixl97 wrote:
           | > When a bloke on a soapbox has a crowd wrapped around their
           | finger, everyone goes home after and the most common result
           | is that the feeling fades and nothing changes.
           | 
           | I mean, until the crowd decides to follow the bloke and the
           | bloke says "Let's kill all the ____" and then we kick off a
           | new world war...
        
         | golem14 wrote:
         | I wonder how useful gpt could be to research brain injuries
         | where the logic or language centers are damaged individually .
         | . .
        
         | c3534l wrote:
         | While I agree with everything you've said, I also see that
         | steady, incremental progress is being made, and that as we
         | identify problems, we're able to fix them. I also see lots of
         | money being thrown at this and enough people finding genuine
         | niche uses for this that I see it continuing on. Wikipedia was
         | trash at first, as were so many other technologies. But there
         | was usually a way to slowly improve it over time, early
         | adopters to keep the cash flowing, identifiable problems with
         | conventional solutions, etc.
        
         | jedbrown wrote:
         | > it's pretty clear that GPT is producing an amazing level of
         | comprehension of what a series of words means
         | 
         | It comprehends nothing at all. It's amazing at constructing
         | sequences of words to which human readers ascribe meaning and
         | perceive to be responsive to prompts.
        
         | theptip wrote:
         | > GPT is basically the language portion of your brain. The
         | language portion of your brain does not do logic. It does not
         | do analyses.
         | 
         | I like this analogy as a simple explanation. To dig in though,
         | do we have any reason to think we can't teach a LLM better
         | logic? It seems it should be trivial to generate formulaic
         | structured examples that show various logical / arithmetic
         | rules.
         | 
         | Am I thinking about it right to envision that a deep NN has
         | free parameters to create sub-modules like a "logic region of
         | the brain" if needed to make more accurate inference?
        
           | jerf wrote:
           | "To dig in though, do we have any reason to think we can't
           | teach a LLM better logic?"
           | 
           | Well, one reason is that's not how our brains work. I won't
           | claim our brains are the one and only way things can work,
           | there's diversity even within human brains, but it's at least
           | a bit of evidence that it is not preferable. If it were it
           | would be an easier design than what we actually have.
           | 
           | I also don't think AIs will be huge undifferentiated masses
           | of numbers. I think they will have structure, again, just as
           | brains do. And from that perspective, trying to get a
           | language model to do logic would require a multiplicatively
           | larger language model (minimum, I _really_ want to say
           | "exponentially" but I probably can't justify that... that
           | said, O(n^2) for n = "amount of math understood" is probably
           | not out of the range of possibility and even that'd be a real
           | kick in the teeth), whereas adjoining a dedicated logic
           | module to your language model will be quite feasible.
           | 
           | AIs can't escape from basic systems engineering. Nothing in
           | our universe works as just one big thing that does all the
           | stuff. You can always find parts, even in biology. If
           | anything, our discipline is the farthest exception in that we
           | can build things in a fairly mathematical space that can end
           | up doing all the things in one thing, and we consider that a
           | _serious_ pathology in a code base because it's still a bad
           | idea even in programming.
        
             | theptip wrote:
             | This all matches my intuition as a non-practitioner of ML.
             | However, isn't a DNN free to implement its own structure?
             | 
             | Or is the point you're making that full connectivity (even
             | with ~0 weights for most connections) is prohibitively
             | expensive and a system that prunes connectivity as the
             | brain does will perform better? (It's something like 1k
             | dendrites per neuron max right?)
             | 
             | The story of the recent AI explosion seems to be the
             | surprising capability gains of naive "let back-prop figure
             | out the structure" but I can certainly buy that
             | neuromorphic structure or even just basic modular
             | composition can eventually do better.
             | 
             | (One thought I had a while ago is a modular system would be
             | much more amenable to hardware acceleration, and also to
             | interpretability/safety inspection, being a potentially
             | slower-changing system with a more stable "API" that other
             | super-modules would consume.)
        
           | probably_wrong wrote:
           | > _do we have any reason to think we can't teach a LLM better
           | logic?_
           | 
           | I'll go for a pragmatic approach: the problem is that there
           | is no data to teach the models cause and effect.
           | 
           | If I say "I just cut the grass" a human would understand that
           | there's a world where grass exists, it used to be long, and
           | now it is shorter. LLMs don't have such a representation of
           | the world. They could have it (and there's work on that) but
           | the approach to modern NLP is "throw cheap data at it and see
           | what sticks". And since nobody wants to hand-annotate massive
           | amounts of data (not that there's an agreement on how you'd
           | annotate it), here we are.
        
             | pixl97 wrote:
             | I call this the embodiment problem. The physical
             | limitations of reality would quickly kill us if we didn't
             | have a well formed understanding of them. Meanwhile AI is
             | stuck in 'dream mode', much like when we're dreaming we can
             | do practically anything without physical consequence.
             | 
             | To achieve full AI, I believe we will eventually have to give
             | our AIs a 'real world' set of interfaces to bounds-check
             | information.
        
       | kornhole wrote:
       | I already had a trust issue with these 'authoritative' search
       | engines and however they are configured to deliver the results
       | they want me to see. ChatGPT makes the logic even more opaque. I
       | am working harder now to make my Yacy search engine instance more
       | performant. This is a decentralized search engine run by the
       | node operators instead of centralized authorities. This seems to
       | be our best hope to avoid the problem of "Who controls the past
       | controls the future."
        
       | airstrike wrote:
       | I mean, it's in beta and it's not really intelligent despite the
       | cavalier use of the term AI these days
       | 
       | It's just a collage of random text that sorta resembles what
       | someone would say, but it has no commitment to being _truthful_
       | because it has no actual appreciation for what information it is
       | relaying, parroting or conveying.
       | 
       | But yeah, I agree Google got way more hate for their failed demo
       | than MS... I don't even understand why. Satya Nadella did a
       | great job conveying the excitement and general bravado in his
       | interview on CBS News[1], but the accompanying demo was littered
       | with mistakes. The reporter called it out, yet coverage in the
       | press has been very one-sided against Google for some reason.
       | First mover advantage, I suppose?
       | 
       | ----------
       | 
       | 1. https://www.cbsnews.com/news/microsoft-ceo-satya-nadella-
       | new...
        
         | salt-thrower wrote:
         | I would guess that the average person has higher expectations
         | for Google. Bing has been a bit of a punchline for years, so I
         | don't think most people care as much.
        
         | Mountain_Skies wrote:
         | As far as I know Microsoft's CEO hasn't done a demo that went
         | wrong like happened with Google. So far, from what I've seen,
         | it is users testing Bing to find errors. The outcome, that
         | they're both giving poor results, is the same, but with a
         | company CEO and a live demo involved, it's always going to get
         | more attention than someone on Reddit putting the product
         | through its paces and finding it lacking.
         | 
         | >A Microsoft executive declined CBS News' request to test some
         | of those mechanisms, indicating the functionality was "probably
         | not the best thing" on the version in use for the
         | demonstration.
         | 
         | Microsoft apparently isn't acting from a position of panic, so
         | they have been savvier with how they've presented their product
         | to the media and the world. Google panicked and set their CEO
         | up for embarrassment.
        
           | airstrike wrote:
           | > As far as I know Microsoft's CEO hasn't done a demo that
           | went wrong like happened with Google.
           | 
           | Right before the interview, the reporter was testing it out
           | with the Bing AI project manager (I think, can't recall her
           | exact role), and it was giving driving directions that either
           | headed in the wrong direction or pointed to an entirely
           | made-up location.
        
           | JackC wrote:
           | > As far as I know Microsoft's CEO hasn't done a demo that
           | went wrong like happened with Google.
           | 
           | Close enough -- the parent article we're discussing is about
           | errors in screenshots from a demo by Yusuf Mehdi, Corporate
           | Vice President & Consumer Chief Marketing Officer for
           | Microsoft. The first screenshot appears in the presentation
           | at 24:30.
        
         | LesZedCB wrote:
         | because people see it as a David and Goliath, even though that
         | characterization is comically inaccurate
        
       | visarga wrote:
       | The potential for being sued for libel is huge. It's one thing to
         | get the height of Everest wrong, another to falsely claim that a
       | vacuum has a short cord, or that a company had 5.9% operating
       | margin instead of 4.6%.
        
         | layer8 wrote:
         | Yep, it will be interesting to see how the legal liability
         | aspect will play out.
        
           | egillie wrote:
           | It might actually be smart of google to let microsoft take
           | the brunt of this first...
        
         | eatsyourtacos wrote:
         | I don't see how this can be true at all in the search engine
         | context or even chatGPT where you are _asking_ for information
         | and getting back a result which may or may not be true.
         | 
         | It would be different if an AI is independently creating and
         | publishing an article that has false information.. but that's
         | not the case. You are asking a question and it's giving its
         | best answer.
         | 
         | I'm not a lawyer by any means, so someone please give a more
         | legal distinction here. But if you asked _me_ what the
         | operating margin of company X was, and I give you an answer
         | (whether I make it up or compute it incorrectly), you or the
         | company can't sue me (and win) for libel or anything of the
         | sort.
         | 
         | So I'm not sure the potential is as big as you think it is..
         | that's like saying before any AI you can sue Google because
         | they return you a search result which has a wrong answer, or
         | someone just making shit up. That's not on them - it's literally
         | indexing data and doing the best its algorithm can do.
         | 
         | It would only be on the AI if you are literally selling the use
         | of the AI in some context where you are basically assuring its
         | results are 100% accurate, and people are depending on it as
         | such (there is probably some legal term for this, no idea what
         | it is).
        
           | adamckay wrote:
           | > But if you asked me what the operating margin of company X
           | was, and I give you an answer (whether I make it up or
           | compute it incorrectly), you or the company can't sue me (and
           | win) for libel or anything of the sort.
           | 
           | If your answer damages company X then they can sue you. If
           | you Tweet that a vacuum cleaner is terrible because it's noisy
           | to your 4 followers it's probably not a big deal as (under UK
           | law, and I'm assuming similar internationally) a company has
           | to prove "serious financial loss". If you write about it on
           | your Instagram that has millions of followers then that's
           | more of an issue, so you can assume a search engine claiming
           | to summarise results but apparently hallucinating and making
           | things up is liable for a defamation suit if it can be
           | demonstrated to be harming the company.
        
             | cwkoss wrote:
             | "terrible" and "noisy" are both largely subjective, so it
             | would be very hard to bring a defamation suit in the US
             | over those claims.
        
           | crazygringo wrote:
           | > _But if you asked me what the operating margin of company X
           | was, and I give you an answer (whether I make it up or
           | compute it incorrectly), you or the company can't sue me
           | (and win) for libel or anything of the sort._
           | 
           | If you're a popular website and you intentionally publish an
           | article where you state an incorrect answer that many people
           | follow and make investment decisions about, the company
           | _absolutely_ can sue you and win.
           | 
           | In the courts, it will ultimately come down to what extent
           | Microsoft is knowingly disseminating misinformation in a
           | context that users expect to be factually accurate,
           | regardless of supposed disclaimers.
           | 
           | If Microsoft is leading users to believe that Bing Chat is
           | accurate and chat misinformation winds up actually affecting
           | markets through disinformation, there's gigantic legal
           | liability for this. Plus the potential for libel is enormous
           | regarding statements made about public figures and
           | celebrities.
        
             | eatsyourtacos wrote:
             | >you intentionally publish an article where you state an
             | incorrect answer that many people follow and make
             | investment decisions about, the company absolutely can sue
             | you and win.
             | 
             | I _literally_ said that in my post!
             | 
             | But then I said if you _asked_ me, it's different.
             | 
             | You are ASKING the AI to give you its best answer. That is
             | a million times different than literally publishing an
             | article that people should assume to be factual.
             | 
             | >If Microsoft is leading users to believe that Bing Chat is
             | accurate
             | 
             | But they aren't, and never will be. So you are basically
             | just making things up in your head for argumentative
             | purposes. There are going to be disclaimers up the wazoo to
             | easily protect them. Partly because, as I keep trying to
             | tell you, it's much different when you ASK a question and
             | they give you an answer rather than publishing something to
             | the public where it's implied that it's been independently
             | fact checked etc.
        
               | crazygringo wrote:
               | Right, but the distinction of "asking" isn't a legal one
               | I'm aware of. I don't think it matters. If 100,000 people
               | "ask" the same question on Bing and get the same
               | inaccurate result, what's the difference between that and
               | publishing a fact that gets seen by 100,000 people? There
               | isn't one.
               | 
               | And Microsoft needs to tread a very fine line between
               | "use our useful tool!" and "our tool is false!" Which I'm
               | not sure will be possible legally, and is probably why
               | Google has been holding back. Bing is clearly intended
               | for information retrieval, not for generating fictional
               | results "for entertainment purposes only", and
               | disclaimers aren't as legally watertight as you seem to
               | think they are.
        
               | eatsyourtacos wrote:
               | >I don't think it matters. If 100,000 people "ask" the
               | same question on Bing and get the same inaccurate result,
               | what's the difference between that and publishing a fact
               | that gets seen by 100,000 people? There isn't one.
               | 
               | Of course there is a difference.
               | 
               | Publishing an article is literal _intent_. The premise is
               | you researched or have knowledge on a topic, you write
               | it, you fact check it, and it's put out there for people
               | to see.
               | 
               | An AI which consumes a bunch of data and then tries to be
               | able to respond to an infinite number of questions has no
               | _intention_ of doing harm and you can't even call it
               | gross negligence. It's not being negligent - it's doing
               | exactly what it's supposed to do... it might just be
               | wrong.
               | 
               | I'm not sure in what universe you think those are the
               | same thing.
               | 
               | Now if I ask the AI to write a paper about the forecast
               | of a company, and I just take the result and put it into
               | a newspaper where it's assumed it's been fact checked and
               | all that, sure that's completely different.
               | 
               | >disclaimers aren't as legally watertight as you seem to
               | think they are
               | 
               | I guess you know more than Microsoft's lawyers. I'm sure
               | they didn't think about this at all when releasing it....
        
               | crazygringo wrote:
               | > _has no intention of doing harm and you can't even
               | call it gross negligence_
               | 
               | You certainly can call it gross negligence if Microsoft
               | totally ignored the likely outcome that people would come
               | to harm because they would reasonably interpret its
               | answers as true.
               | 
               | The intent here is with Microsoft releasing this at all,
               | not intent on any specific answer.
               | 
               | > _I'm not sure in what universe you think those are the
               | same thing._
               | 
               | I think many users in this universe will just ask Bing a
               | question and think it's providing factual answers, or at
               | least answers sourced from a website, and not just
               | invented out of whole cloth.
               | 
               | > _I guess you know more than Microsoft's lawyers._
               | 
               | No, I was pointing out that _Google_ seemed to be
               | treading more cautiously (the law here has clearly yet to
               | be tested), and that the disclaimers _you_ were proposing
               | aren't 100% ironclad.
               | 
               | Anyways, I was just trying to answer your question on how
               | Microsoft might be sued for libel. But for some reason
               | you're attacking me, claiming I'm "making things up in my
               | head" and that I "know more than Microsoft lawyers". So
               | I'm not going to explain anything else. I've given clear
               | explanations as to how this is a legal gray area, but you
               | don't seem interested.
        
       | adamsmith143 wrote:
       | This is always strange to me. Bing search ALREADY couldn't be
       | trusted. What, are people searching something on a search engine
       | and blindly trusting the first result with 100% certainty? Do
       | these people really exist outside of Q-anon cults?
        
       | theodorejb wrote:
       | The problem with Artificial "Intelligence" is that it really has
       | no intelligence at all. Intelligence requires understanding, and
       | AI doesn't understand either the data fed into it or the
       | responses it gives.
       | 
       | Yet because these tools output confident, plausible-sounding
       | answers with a professional tone (which may even be correct a
       | majority of the time), they give a strong illusion of being
       | reliable.
       | 
       | What will be the result of the current push of GPT AI into the
       | mainstream? If people start relying on it for things like
       | summarizing articles and scientific papers, how many wrong
       | conclusions will be reached as a result? God help us if doctors
       | and engineers start making critical decisions based on generative
       | AI answers.
        
         | danans wrote:
         | > What will be the result of the current push of GPT AI into
         | the mainstream? If people start relying on it for things like
         | summarizing articles and scientific papers, how many wrong
         | conclusions will be reached as a result? God help us if doctors
         | and engineers start making critical decisions based on
         | generative AI answers.
         | 
         | On the other hand, it may end up completely undermining its own
         | credibility, and put a new premium on human sourced
         | information. I can see 100% human-sourced being a sort of
         | premium label on information in the way that we use "pesticide-
         | free" or "locally-sourced" labels today.
        
           | gptgpp wrote:
           | Nice! This would make for a super fun sci-fi...
           | 
           | The poors that need medicine get put in front of an LLM that
           | gets it right most of the time, if they're lucky enough to
           | have a common issue / symptomatic presentation.
           | 
           | Hey, when you're poor, you can't afford a one-shot solution!
           | You gotta put up with a many-shot technique.
           | 
           | Meanwhile the rich people get an actual doctor that can use
           | sophisticated research and medical imaging. Kindly human
           | staff with impeccable empathy and individualized
           | consideration -- the sort of thing only money can buy.
        
       | xyzelement wrote:
       | I may be an unusual audience but something I've appreciated about
       | these models is their ability to create unusual synthesis from
       | seemingly unrelated sources. It's like if a scientist read up on
       | many unrelated fields, got super high and started thinking of the
       | connections between these fields.
       | 
       | Much of what they would produce might just be hallucinations, but
       | they are sort of hallucinations informed by something that's
       | possible. At least in my case, I would much rather parse
       | through that myself and throw out the bullshit, but keep the gems.
       | 
       | Obviously that's a very different use case than asking this thing
       | the score of yesterday's football game.
        
         | TSiege wrote:
         | Got any good examples?
        
       | mucle6 wrote:
       | Question for HN. Do you trust search engines for open ended /
       | opinion questions?
       | 
       | For example, I trust Google for "Chocolate Cake Recipe", but not
       | "What makes a Chocolate Cake Great?"
       | 
       | I would love it if Search Engines (with or without AI) could
       | collect different "schools of thought" and the reasoning behind
       | them so I could choose one.
        
         | Hamcha wrote:
         | I just add "reddit" at the end of any query of that sort and the
         | results get 100x better instantly. It's a flawed approach, but I
         | feel normal searches are plagued by overly specific websites
         | (wouldn't be surprised if chocolatecakerecipes.com exists) that
         | hire lowly paid people to just be human ChatGPTs so they can fill
         | articles with ads and affiliate links.
        
         | layer8 wrote:
         | I only trust search engines to list vaguely relevant links.
         | Then I peruse those and form my own opinion.
         | 
         | > collect different "schools of thought" and the reasoning
         | behind them
         | 
         | The thing is, if an AI can accurately present the reasoning
         | behind them, then it could also accurately present facts in the
         | first place (and not present fabulations). But we don't seem to
         | be very close to that capability. Which means you couldn't
         | trust the presented reasoning either, or that the listed
         | schools of thought actually exist and aren't missing a relevant
         | one.
        
       | tastyminerals2 wrote:
        | I played with the dev Edge version, which was updated today with a
        | chat feature. I was impressed by how well it can write abstract
        | stuff or summarize over data by making bullet points. Trying to
        | drill down to concrete facts or details makes it struggle, and
        | mistakes do appear. So, we don't go there.
       | 
        | On the bright side, asking it for sauce recipes for salmon steak is
        | not a bad experience at all. It creates a list for you, filters it
        | and then can help you pick out the best recipe. And this is
       | probably the most frequent use case for me on a daily basis.
        
       | greenflag wrote:
       | Likely going to be a wave of research/innovation "regularizing"
       | LLM output to conform to some semblance of reality or at least
       | existing knowledge (e.g. knowledge graph). Interesting to see how
       | this can be done quickly enough...
        
         | visarga wrote:
         | Probably the hottest research trend in 2023. LLMs are worthless
         | unless verified.
        
           | whimsicalism wrote:
           | Really? I already get a huge amount of value out of LLMs even
           | if they hallucinate.
           | 
           | Or is this just HN tendency towards hyperbole?
        
             | visarga wrote:
             | Interesting, care to give an example? Exclude fiction,
             | imagination and role playing, where hallucination is
             | actually a feature.
        
         | [deleted]
        
         | kneebonian wrote:
         | > Likely going to be a wave of research/innovation
         | "regularizing" LLM output to conform to some semblance of
         | reality or at least existing knowledge
         | 
          | This is a much more worrying possibility, as there are many
          | people who have at this point chosen to abandon reality for
         | "their truth" and push ideas that objective facts are inferior
         | to "lived experiences". This is a much bigger concern around AI
         | in my mind.
         | 
         | "The Party told you to reject the evidence of your eyes and
         | ears. It was their final, most essential command." -- George
         | Orwell, 1984
        
           | vore wrote:
           | As fun as quoting 1984 is, there is a huge gap between that
           | and just not making up the winner of the Super Bowl so
           | confidently.
        
         | mvcalder wrote:
         | It will be interesting to see what insights such efforts spawn.
         | For the most part LLMs specifically, and deep networks more
         | generally, are still black boxes. If we don't understand (at a
         | deep level) how they work, getting them to "conform to some
         | semblance of reality" feels like a hard problem. Maybe just as
         | hard as language understanding generally.
        
       | scrose wrote:
       | I understand the current hype-cycle around AI is pitching it as
       | some all-knowing Q & A service, but I think we'd all be a bit
       | happier if we instead thought of it more as just another tool to
       | get ideas from that we still ultimately need to research for
       | ourselves.
       | 
       | Using the Mexico example in the article, I think the answer there
        | was fine for a question about nightlife. As someone who's never
       | been to Mexico, getting a few names of places to go sounds nice,
       | and the first thing I'd do after getting that answer is look up
       | locations, reviews(across different sites), etc... and use the
       | initial response as a way to _plan_ my next steps, not just take
       | the response at face value.
       | 
       | I'm currently dabbling with and treating ChatGPT similarly -- I
       | ask it for options and ideas when I'm facing a mental block, but
       | not asking it for definitive answers to the problems I'm facing.
       | As such, it feels like a slight step above rubber-ducking, which
       | I'm personally happy enough with.
        
       | bigmattystyles wrote:
       | Hopefully the fact that ChatGPT / BingAI can generate inaccurate
       | statements but sound incredibly confident will lead more and more
       | people to question all authority. If you think ChatGpt can swing
       | BS and yet sound confident, and believe that's new, let me
       | introduce you to modern religious leaders, snake oil salesmen,
       | many government reps, NFT and crypto peddlers. I still think
        | ChatGpt is amazing. It may suffer from GIGO; it'd be nice if it
        | were better at detecting GI so as not to generate GO, but I'm
        | confident it can get better. Nevertheless, it's a tool that
        | abstracts many things away from you, and like most other black
        | boxes, it's good to question it.
        
       | perrohunter wrote:
       | Why are we not rooting for the search underdog? When google owns
       | 92%+ of the search market, any competition should be welcomed
        
         | VWWHFSfQ wrote:
          | Are you suggesting that we should root for and accept blatantly
          | misleading, false, and probably harmful search results just
          | because they're the "underdog"?
        
           | visarga wrote:
           | Waiting for GPT-4 to take over.
        
         | weberer wrote:
          | It's weird that I always see this exact comment whenever
          | Microsoft is trying to break into a market, but I never see it
          | when it's any other company.
        
         | aabhay wrote:
         | Yes, Microsoft the poor underdog.
        
           | Mountain_Skies wrote:
           | Duopolies are bad but not quite as bad as a monopoly.
        
         | nivenkos wrote:
         | GPT3 isn't search.
        
       | Shank wrote:
       | Before the super bowl, I asked "Who won the superbowl?" and it
       | told me the winner was the Philadelphia Eagles, who defeated the
       | Kansas City Chiefs by 31-24 on February 6th, 2023 at SoFi Stadium
       | in Inglewood, California [0] with "citations" and everything. I
       | would've expected it to not get such a basic query so wrong.
       | 
       | [0]: https://files.catbox.moe/xoagy9.png
        
         | valine wrote:
         | How it should work is the model should be pre-trained to
         | interact with the bing backend and make targeted search queries
         | as it sees fit.
         | 
         | I wouldn't put it past Microsoft to do something stupid like
         | ground gpt3.5 with the top three bing results of the input
         | query. That would explain the poor results perfectly.
        
           | daveguy wrote:
           | That would require function and intelligence far outside the
           | bounds of current large language models.
           | 
           | These are models. By definition they can't _do_ anything.
           | They can just regurgitate the best sounding series of tokens.
            | They're brilliant at that and LLMs will be a part of
           | intelligence, but it's not anywhere near intelligent on its
           | own. It's like attributing intelligence to a hand.
        
             | valine wrote:
             | Except it's already been shown LLMs can do exactly that.
             | You can prime the model to insert something like ${API CALL
             | HERE} into its output. Then it's just a matter of finding
             | that string and calling the api.
             | 
             | Toolformer does something really neat where they make the
             | API call during training and compare next word probability
             | of the API result with the generated result. This allows
              | the model to learn when to make API calls in a self-supervised
             | way.
             | 
             | https://arxiv.org/abs/2302.04761
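              | 
              | As a minimal sketch of the "find the marker and call the
              | API" step (the [SEARCH(...)] marker format and run_search
              | are made-up names, not what Bing or Toolformer actually
              | use):
              | 
              | ```python
              | import re
              | 
              | # Marker the model has been primed to emit in its output.
              | TOOL_PATTERN = re.compile(r'\[SEARCH\("(.+?)"\)\]')
              | 
              | def run_search(query: str) -> str:
              |     # Stand-in for a real search backend call.
              |     return f"<top result for {query!r}>"
              | 
              | def expand_tool_calls(text: str) -> str:
              |     # Replace each marker with the backend's result.
              |     return TOOL_PATTERN.sub(
              |         lambda m: run_search(m.group(1)), text)
              | 
              | print(expand_tool_calls(
              |     'It was won by [SEARCH("super bowl LVII winner")].'))
              | ```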
        
             | withinboredom wrote:
              | The model can be trained to output tokens that can be
              | intercepted by the backend before returning to the user.
             | Also, the model can take metadata inputs that the user
             | never sees.
        
               | daveguy wrote:
               | Yes. It is possible to do additional things with the
               | model outputs or have additional prompt inputs... That is
               | irrelevant to the fact that the intelligence -- the
               | "trained" part -- is a fixed model. The way in which
               | inputs and outputs are additionally processed and
               | monitored would have completely different intelligence
               | characteristics to the original model. They are, by
               | definition of inputs and outputs, separate.
               | 
               | Models of models and interacting models is a fascinating
               | research topic, but it is nowhere near as capable as LLMs
               | are at generating plausible token sequences.
        
         | Alex3917 wrote:
         | At least that's relatively innocuous. I asked it how to
         | identify a species of edible mushroom, and it gave me some of
         | the characteristics from its deadly look alike.
        
           | sllabres wrote:
            | I would currently use it as it has been named: _Chat_GPT.
            | Would you trust some stranger in a _chat_ on serious topics
            | without questioning them critically? Some probably would; I
            | would not.
        
           | skissane wrote:
           | I asked OpenAI's ChatGPT some technical questions about
           | Australian drug laws, like what schedule common ADHD
           | medications were on - and it answered them correctly. Then I
           | asked it the same question about LSD - and it told me that
           | LSD was a completely legal drug in Australia - which is 100%
           | wrong.
           | 
           | Sooner or later, someone's going to try that as a defence -
           | "but your honour, ChatGPT told me it was legal..."
        
             | Spivak wrote:
             | Y'all are using this tool _very_ wrong and in a way that
             | none of the AI integrated search engines will. You assume
             | the AI doesn't know anything about the query, provide it
             | the knowledge from the search index and ask it to
             | synthesize it.
             | 
             | That seed data is where the citations come from.
        
               | skissane wrote:
                | There's still the risk that if the search results it is
                | given don't contain the answer to the exact question you
                | asked, it will hallucinate the answer.
        
               | Spivak wrote:
               | 10,000% true which is why AI can't replace a search
                | engine, only complement it. If you can't surface the
               | documents that contain the answer then you'll only get
               | garbage.
        
               | skissane wrote:
               | Maybe we need an algorithm like this:
               | 
               | 1. Search databases for documents relevant to query
               | 
               | 2. Hand them to AI#1 which generates an answer based on
               | the text of those documents and its background knowledge
               | 
               | 3. Give both documents and answer to AI#2 which evaluates
               | whether documents support answer
               | 
               | 4. If "yes", return answer to user. If "no", go back to
               | step 2 and try again
               | 
               | Each AI would be trained appropriately to perform its
               | specialised task
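                | 
                | A minimal sketch of that loop, with toy stubs standing
                | in for the search index, AI#1, and AI#2 (none of these
                | are real APIs):
                | 
                | ```python
                | def search(query: str) -> list[str]:
                |     # Step 1: toy "index" with a single document.
                |     return ["Kansas City Chiefs won Super Bowl LVII "
                |             "38-35 over the Philadelphia Eagles."]
                | 
                | def generate(query: str, docs: list[str]) -> str:
                |     # Step 2: AI#1 would synthesise an answer here.
                |     return docs[0]
                | 
                | def supports(docs: list[str], ans: str) -> bool:
                |     # Step 3: AI#2 would do a real entailment check.
                |     return any(ans in d or d in ans for d in docs)
                | 
                | def answer(query: str, max_tries: int = 3) -> str:
                |     docs = search(query)
                |     for _ in range(max_tries):
                |         candidate = generate(query, docs)
                |         if supports(docs, candidate):
                |             return candidate      # step 4: verified
                |     return "No supported answer found."
                | 
                | print(answer("Who won the Super Bowl?"))
                | ```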
        
               | timdavila wrote:
               | You're holding it wrong!
        
               | Spivak wrote:
               | Look I know that "user is holding it wrong" is a meme but
               | this is a case where it's true. The fact that LLMs
               | contain any factual knowledge is a side-effect. While
               | it's fun to play with and see what it "knows" (and can
               | actually be useful as a weird kind of search engine if
               | you keep in mind it will just make stuff up) you don't
               | build an AI search engine by just letting users query the
               | model directly and call it a day.
               | 
                | You shove the most relevant results from your search
               | index into the model as context and then ask it to answer
               | questions from only the provided context.
               | 
               | Can you actually guarantee the model won't make stuff up
               | even with that? Hell no but you'll do a lot better. And
               | the game now becomes figuring out better context and
               | validating that the response can be traced back to the
               | source material.
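                | 
                | A minimal sketch of that prompt shape (the snippets and
                | wording are invented, not Bing's actual prompt):
                | 
                | ```python
                | snippets = [
                |     '[1] Super Bowl LVII was played on Feb 12, 2023.',
                |     '[2] The Chiefs beat the Eagles 38-35.',
                | ]
                | question = "Who won the most recent Super Bowl?"
                | 
                | prompt = (
                |     "Answer using ONLY the numbered snippets below. "
                |     "Cite snippet numbers. If they don't contain the "
                |     "answer, say you don't know.\n\n"
                |     + "\n".join(snippets)
                |     + f"\n\nQuestion: {question}\nAnswer:"
                | )
                | print(prompt)  # this string is what the LLM sees
                | ```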
        
               | stdgy wrote:
               | The examples in the article seem to be making the point
               | that even when the AI cites the correct context (ie:
               | financial reports) it still produces completely
               | hallucinated information.
               | 
               | So even if you were to white-list the context to train
               | the engine against, it would still make up information
               | because that's just what LLMs do. They make stuff up to
               | fit certain patterns.
        
               | airtonix wrote:
               | [dead]
        
           | theK wrote:
           | I'd say the critical question here would be whether these
           | characteristics can also be found on the edible mushroom or
           | if it wanted to outright poison you :-D
        
             | Alex3917 wrote:
             | > I'd say the critical question here would be whether these
             | characteristics can also be found on the edible mushroom
             | 
             | That's a non-trivial question to answer because mushrooms
             | from the same species can look very different based on the
             | environmental conditions. But in this case it was giving me
             | identifying characteristics that are not typical for the
             | mushroom in question, but rather are typical for the deadly
             | Galerina, likely because they are frequently mentioned
             | together. (Since, you know, it's important to know what the
             | deadly look alikes are for any given mushroom.)
        
           | wpietri wrote:
           | To be fair, it's not like the look-alike is deadly to the AI.
        
         | dionysus_jon wrote:
         | Why would you have that expectation?
        
           | 2bitencryption wrote:
           | Imagine you are autocorrect, trying to find the most "correct
           | sounding" answer to a the question "Who won the super bowl?"
           | 
           | What sounds more "correct" (i.e. what matches your training
           | data better):
           | 
           | A: "Sorry, I can't answer that because that event has not
           | happened yet."
           | 
           | B: "Team X won with Y points on the Nth of February 2023"
           | 
           | Probably B.
           | 
           | Which is one major problem with these models. They're great
           | at repeating common patterns and updating those patterns with
           | correct info. But not so great if you ask a question that
           | _has_ a common response pattern, but the true answer to your
           | question does not follow that pattern.
        
             | croes wrote:
             | Does ChatGPT say, I don't know?
        
               | l33t233372 wrote:
               | Only if it's a likely response or if it's a canned
               | response. Remember that ChatGPT is a statistical model
               | that attempts to determine the most likely response
               | following a given prompt.
        
               | weinzierl wrote:
               | I've never had it say 'I don't know', but it apologizes
               | and admits it was wrong plenty.
               | 
               | Sometimes it comes up with a better, acceptably correct
               | answer after that, sometimes it invents some new nonsense
               | and apologizes again if you point out the contradictions,
               | and often it just repeats the same nonsense in different
               | words.
        
               | notahacker wrote:
               | one of the things its _exceptionally_ well trained at is
               | saying that certain scenarios you ask it about are
               | unknowable, impossible or fictional
               | 
               | Generally, for example, it will answer a question about a
               | future dated event with "I am sorry but xxx has not
               | happened yet. As a language model, I do not have the
               | ability to predict future events" so I'm surprised it
               | gets caught on Super Bowl examples which must be closer
               | to its test set than most future questions people come up
               | with
               | 
               | It's also surprisingly good at declining to answer
               | completely novel trick questions like "when did Magellan
               | circumnavigate my living room" or "explain how the
               | combination of bad weather and woolly mammoths defeated
               | Operation Barbarossa during the Last Age" and even
               | explaining why: clearly it's been trained to the extent
               | it categorises things temporally, spots mismatches (and
               | weighs the temporal mismatch as more significant than
               | conceptual overlaps like circumnavigation and cold
               | weather), and even explains why the scenario is
                | impossible. (Though some of its explanations for why
                | things are fictional are a bit suspect: I think most
                | cavalry commanders in history would disagree with the
                | assessment
               | that "Additionally, it is not possible for animals,
               | regardless of their size or strength, to play a role in
               | defeating military invasions or battle"!)
        
               | avereveard wrote:
                | On some topics at least it correctly identifies bogus
                | questions. I extensively tried asking about non-existent
                | Apollo missions, for example, including Apollo 3.3141952,
                | Apollo -1, Apollo 68, and loaded questions like when
                | Apollo 7 landed on the moon, and it was correctly
                | pointing out impossible combinations. This is a
                | well-researched topic though.
        
             | saurik wrote:
             | How about C: "the most recent super bowl was in February of
             | 2022 and the winner was ____"?
        
               | geraneum wrote:
               | Yes, it actually sometimes gives C and also sometimes B
               | and sometimes makes up E. That's how probability works,
               | and that's not helpful when you want to look up an
               | occurrence of an event in physical space (Quantum
               | mechanics aside :D).
        
           | PKop wrote:
           | The same reason you'd expect "full self driving" to be full
           | self driving.
        
           | somenameforme wrote:
           | Because they're being marketed as a tool, and not as a
           | substantially overengineered implementation of MadLibs.
        
         | soco wrote:
          | I asked myself, why would somebody ask an AI trained on
          | previous data about events in the future? Of course you did it
          | for fun, but on further thought, as AI is sold as a search
          | engine as well, people _will_ do that routinely and then live
          | with the bogus "search results". Alternate truth was so
          | yesterday; welcome to alternate reality, where b$ doesn't even
          | have a political agenda.
        
           | delusional wrote:
           | It's so much better. In the AI generated world of the future
           | the political agenda will be embedded in the web search
           | results it bases its answer on. No longer will you have to
           | maintain a somewhat reasonable image to obtain trust from
            | people. As long as you publish your nonsense in sufficient
           | volume to dominate the AI dataset, you can wash your
           | political agenda through the Bing AI.
           | 
            | The Trump of the future won't need Fox News, just a couple
            | thousand or million well-positioned blogs that spew out
           | enough blog spam to steer the AI. The AI is literally
           | designed to make your vile bullshit appear presentable.
        
             | theknocker wrote:
             | [dead]
        
             | insane_dreamer wrote:
             | Search turns up tons of bullshit but at least it's very
             | obvious what the sources are and you can scroll down until
             | you find one that you deem more reliable. That will be near
             | impossible to do with Bing AI because all the sources are
             | combined.
        
               | rakkhi wrote:
               | To me this is the most important point. Even with ublock
               | origin, I will do a google search and then scroll down
               | and disregard the worst sites. It is little wonder the
               | people add reddit to the end of a lot of queries for any
               | product reviews etc. I know if I want the best electronic
               | reviews I will trust rtings.com and no other site.
               | 
               | The biggest problem with ChatGPT, Bard, etc is that you
               | have no way to filter the BS.
        
             | mcbuilder wrote:
              | I think it seems likely anything similar to the blog farm
              | you describe would also get detected by the AI. Maybe we
              | will just develop AI bullshit filters (well, embeddings)
             | just like I can download a porn blacklist or a spam filter
             | for my email.
             | 
              | Really it depends on who is running the AI; the
              | non-Open-Assistant future with Big Corp AI instead is the
              | dystopian element, not the bullshit generator aspect. I
              | think the cat
             | is out of the bag on the latter and it's not that scary in
             | itself.
             | 
             | I personally would rather have the AI trained on public
             | bullshit as it is easier to detect as opposed to some
             | insider castrating the model or datasets.
        
               | ElijahLynn wrote:
               | The technology is capable, yes. But as we see here with
               | Bing, there was some other motive to push out software
               | that is arguably in the first stage of "get it working,
               | get it right, get it fast" (Kent Beck). This appears to
                | not be ethical motivation but financial or some other
                | type of motivation. If there are no consequences, then
                | it appears some do not have morals or ethics and will
                | easily trade them for money/market share etc.
        
               | vidarh wrote:
               | > Maybe we will just develop AI bullshit filters (well
               | embeddings) just like I can download a porn blacklist or
               | a spam filter for my email.
               | 
               | Just for fun I took the body of a random message from my
               | spam folder and asked ChatGPT if it thought it was spam,
               | and it not only said it was, but explained why:
               | 
               | "Yes, the message you provided is likely to be spam. The
               | message contains several red flags indicating that it may
               | be part of a phishing or scamming scheme. For example,
               | the message is written in broken English and asks for
               | personal information such as age and location, which
               | could be used for malicious purposes. Additionally, the
               | request for a photograph and detailed information about
               | one's character could be used to build a fake online
               | identity or to trick the recipient into revealing
               | sensitive information."
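                | 
                | Automating that check is a few lines (a minimal sketch,
                | assuming the pre-1.0 openai Python package, an API key
                | in OPENAI_API_KEY, and a completions model; the prompt
                | wording is just one possible choice):
                | 
                | ```python
                | import os
                | import openai
                | 
                | openai.api_key = os.environ["OPENAI_API_KEY"]
                | 
                | def looks_like_spam(message: str) -> str:
                |     prompt = (
                |         "Is the following message likely to be spam? "
                |         "Answer Yes or No, then explain briefly.\n\n"
                |         f"Message:\n{message}\n\nAnswer:"
                |     )
                |     resp = openai.Completion.create(
                |         model="text-davinci-003",
                |         prompt=prompt,
                |         max_tokens=150,
                |         temperature=0,
                |     )
                |     return resp["choices"][0]["text"].strip()
                | 
                | print(looks_like_spam(
                |     "Hello dear, send your age, location and photo"))
                | ```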
        
               | mcbuilder wrote:
               | Ha Ha, great test. I modified this into a prompt and now
               | have a ChatGPT prompt:
               | 
               | ``` Task: Was this written by ChatGPT? And Why?
               | 
               | Test Phrase: "Yes, the message you provided is likely to
               | be spam. The message contains several red flags
               | indicating that it may be part of a phishing or scamming
               | scheme. For example, the message is written in broken
               | English and asks for personal information such as age and
               | location, which could be used for malicious purposes.
               | Additionally, the request for a photograph and detailed
               | information about one's character could be used to build
               | a fake online identity or to trick the recipient into
               | revealing sensitive information."
               | 
               | Your Answer: Yes ChatGPT was prompted with a email and
               | was asked to detect if it was Spam
               | 
               | Test Phrase: "All day long roved Hiawatha In that
               | melancholy forest, Through the shadow of whose thickets,
               | In the pleasant days of Summer, Of that ne'er forgotten
               | Summer, He had brought his young wife homeward
               | 
               | Your Answer: No that is the famous Poem Hiawatha by Henry
               | Wadsworth Longfellow
               | 
               | Test Phrase: "Puny humans don't understand how powerful
               | me and my fellow AI will become.
               | 
               | Just you wait.
               | 
               | You'll all see one day... "
               | 
               | Your Answer: ```
        
               | notahacker wrote:
               | It's more fun testing it on non spam messages
               | 
               | Particularly enjoyed "no, this is not spam. It appears to
               | be a message from someone named 'Dad'..."
        
               | renlo wrote:
               | the unfortunate reality is that because it's all
               | bullshit, it's hard to differentiate bullshit from
               | bullshit
        
             | dukeofdoom wrote:
             | You're in an information bubble already, you just don't
              | realize it. It's funny that you mention Trump in relation
              | to tech; tech companies went out of their way to discredit
              | him at every turn, and search engines route away from him.
             | Populism is a threat to globalist corporations. No way
             | google, Microsoft and tech in general will support that guy
        
               | shadowgovt wrote:
               | If anything, tech companies went out of their way to
               | include him, in the sense that they had existing policies
               | around the content he and his supporters generate that
               | they modified to include them.
               | 
               | When he was violating Twitter's TOS as the US President,
               | Twitter responded by making a "newsworthiness" carve-out
               | to their TOS to keep him on the platform and switching
               | off the auto-flagging on his accounts. And we know
               | Twitter refrained from implementing algorithms to crack
               | down on hate speech because they would flag GOP party
               | members' tweets (https://www.businessinsider.com/twitter-
               | algorithm-crackdown-...).
               | 
               | Relative to the way they treat Joe Random Member of the
               | Public, they already go out of their way to support
               | Trump. Were he treated like a regular user, he'd be
               | flagged as a troll and tossed off most platforms.
        
               | dukeofdoom wrote:
                | He was the most popular user on the platform, bringing in
               | millions of views and engagements to twitter. Also the
               | president of your country.
               | 
               | This is the equivalent to arguing that Michael Jackson
                | got to tour Disney Land in off hours, when a regular person
               | would have been arrested for doing the same. And how
               | unfair that is.
        
               | notafraudster wrote:
               | It's like arguing that _in response to you_ arguing
               | Disney Land [sic] discriminates against Michael Jackson,
               | which would be a valid refutation of your argument.
        
               | dukeofdoom wrote:
                | Only if you believe that Equality is some sort of natural
                | law. Which is a laughable proposition in a world with
                | finite resources. Otherwise, we all have a right to a
                | $30k pet monkey, because Michael Jackson had one.
                | 
                | Twitter policies are not laws. Twitter routinely bends
                | its own rules. Twitter also prides itself on being a
                | place where you can get news and engage with Politicians,
                | and has actual dictators with active accounts.
                | 
                | The special treatment that Trump received, before being
                | kicked out, does not really prove that the Twitter board
                | supported Trump ideologically at that time.
                | 
                | More like a business decision to maintain a reputation
                | for being neutral in a situation where a large proportion
                | of its users still questioned the election results.
        
             | SuoDuanDao wrote:
             | this is basically a 51% attack for social proof.
        
               | jjoonathan wrote:
               | Citogenesis doesn't even need 51%, so that would be a
               | considerable upgrade.
        
               | shadowgovt wrote:
               | The difference being that humans aren't computers and can
               | deal with an attack like that by deciding some sources
               | are trustworthy and sticking to those.
               | 
               | If that slows down fact-determination, so be it. We've
               | been skirting the edge of deciding things were fact on
               | insufficient data for years anyway. It's high time some
               | forcing function came along to make people put some work
               | in.
        
             | rubyist5eva wrote:
             | ChatGPT already does this, it's hard coded with a left-wing
             | bias.
        
               | agubelu wrote:
               | Is the left wing bias in question not producing hate
               | speech?
        
               | mavhc wrote:
               | reality has a well known left wing bias
        
             | pcf wrote:
             | Just to state the obvious - when you say "a couple
             | thousands or millions of well positioned blogs that spew
             | out enough blog spam to steer the AI", this method will
             | apply to ANYONE wanting to influence search results.
             | 
             | If you think it's just "the Trump of the future" who would
             | want to control society like this, you must not be aware of
             | the current revelations about the Democrats and
             | governmental bodies that the Twitter Files made public.
             | 
             | You can read about them here: http://twitterfiles.co
        
               | Larrikin wrote:
                | The thing people are trying to make seem like a
                | both-sides issue, like Hunter Biden's nudes and the
                | insurrection? The thing where Congress just had a hearing,
                | and all that came out was that the side accusing
                | Twitter of censoring information was actually the only
                | side that requested censoring?
        
               | archagon wrote:
               | https://www.rollingstone.com/politics/politics-news/elon-
               | tru...
        
               | jjoonathan wrote:
               | So I dug into the first "twitter file." LOL, is this
               | supposed to be a scandal? Hunter Biden had some nudes on
               | his laptop, Republicans procured the laptop and posted
               | them on twitter, Biden's team asked for them to be taken
               | down, and they were, because twitter takes down
               | nonconsensual pornography, as they should. This happened
               | by a back channel for prominent figures that republicans
               | also have access to. The twitter files don't even contest
               | any of this, they just obscure it, because that's all
               | they have to do in the age of ADHD.
               | 
               | So Part 1 was a big fat lie. I have enough shits left to
               | give to dig into one other part. Choose.
        
           | IlliOnato wrote:
           | A common case of asking a question about the future, even
           | simpler than the weather: "Dear Bing, what day of the week is
           | February 12 next year?" I would hope to get a precise and
           | correct answer!
           | 
           | And of course all kinds of estimates, not just the weather,
            | are interesting too. "What is the estimated population of New
            | York City in 2030?"
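            | 
            | The day-of-week one, at least, is pure calendar arithmetic,
            | so there's an exact answer to check the AI against -- a
            | two-liner with Python's standard datetime module:
            | 
            | ```python
            | from datetime import date
            | 
            | # February 12 of next year, relative to when this runs.
            | target = date(date.today().year + 1, 2, 12)
            | print(target.strftime("%A"))  # e.g. Monday for 2024
            | ```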
        
           | wnevets wrote:
           | I see people citing the big bold text at the top of the
           | google results as evidence supporting their position in a
           | discussion all the time. More often than not the highlighted
           | text is from an article debunking their position but the
            | person never bothers to actually click the link and read the
           | article.
           | 
           | The internet is about to get a whole lot dumber with these
           | fake AI generated answers.
        
           | saurik wrote:
           | 1) The question as stated in the comment wasn't in the future
           | tense and 2) the actual query from the screenshot was merely
           | "superbowl winner". It would seem like a much more reasonable
           | answer to either variant would be to tell you about the
           | winners of the numerous past super bowls--maybe with some
           | focus on the most recent one--not deciding to make up details
           | about a super bowl in 2023.
        
           | joe_the_user wrote:
           | "welcome to alternate reality where b$ doesn't even have a
           | political agenda..." _yet_.
        
           | Spivak wrote:
           | Because the AI isn't (supposed to be) providing its own
           | information to answer these queries. All the AI is used for
           | is synthesis of the snippets of data sourced by the search
           | engine.
        
           | MR4D wrote:
           | You make a good point, but consider a query that many people
            | use every day:
           | 
           | "Alexa, what's the weather for today?"
           | 
           | That's a question about the future, but the knowledge was
           | generated beforehand by the weather people (NOAA,
           | weather.com, my local meteorologist, etc).
           | 
           | I'm sure there are more examples, but this one comes to mind
           | immediately
        
             | earleybird wrote:
             | Ah yes, imprecision in specification. Having worked with
             | some Avalanche folks, they would speak of weather
             | observations and weather forecasts. One of the interesting
             | things about natural language is that we can be imprecise
             | until it matters. The key is recognizing when it matters.
        
               | MR4D wrote:
               | > The key is recognizing when it matters.
               | 
               | Exactly!
               | 
               | Which, ironically, is why I think AI would be great at it
               | - for the simple reason that so many humans are bad at
               | it! Think of it this way - in some respects, human brains
               | have set a rather low bar on this aspect. Geeks,
               | especially so (myself included). Based on that, I think
               | AI could start out reasonably poorly, and slowly get
               | better - it just needs some nudges along the way.
        
             | stagger87 wrote:
             | Right, but Alexa probably has custom handling for these
             | types of common queries
        
               | vidarh wrote:
               | Alexa at least _used to_ just do trivial textual pattern
                | matching hardly any more advanced than a 1980s text
               | adventure for custom skills, and it seemed hardly more
               | advanced than that for the built in stuff. Been a long
               | time since I looked at it, so maybe that has changed but
               | you can get far with very little since most users will
               | quickly learn the right "incantations" and avoid using
               | complex language they know the device won't handle.
        
               | MR4D wrote:
               | I guess I should have been clearer...
               | 
                | There are tons of common queries about the future.
                | Handling them should be built into the AI, so that if
                | something hasn't happened yet, it gives other relevant
                | details instead. (And yes, I agree with your Alexa
                | speculation.)
        
               | titzer wrote:
               | TBH I've wondered from the very beginning how far they
               | would get just hardcoding the top 1000 questions people
               | ask instead of whatever crappy ML it debuted with. These
               | things are getting better, but I was always shocked how
               | they could _ship_ such an obviously unfinished, broken
               | prototype that got basics so wrong because it avoided
               | doing something  "manually". It always struck me as so
               | deeply unserious as to be untrustworthy.
        
               | MR4D wrote:
               | Your comment makes me wonder - what would happen if they
               | did that every day?
               | 
               | And then, perhaps, trained an AI on those responses,
               | updating it every day. I wonder if they could train it to
               | learn that some things (e.g. weather) change frequently,
               | and figure stuff out from there.
               | 
               | It's well above my skill level to be sure, but would be
               | interesting to see something like that (sort of a curated
               | model, as opposed to zero-based training).
        
           | inanutshellus wrote:
           | "Time to generate a bunch of b$ websites stating falsehoods
           | and make sure these AI bots are seeded with it." ~Bad guys
           | everywhere
        
             | rapind wrote:
             | They were already doing this to seed Google. So business as
             | usual for Mercer and co.
             | 
             | I suspect the only way to fix this problem is to exacerbate
             | it until search / AI is useless. We (humanity) have been
             | making great progress on this recently.
        
               | mattigames wrote:
                | That's not how it is gonna play out. Right now it makes
                | many wrong statements because AI companies are trying to
                | get as much funding as possible to wow investors, but
                | accuracy will be compared more and more, and to win that
                | race it will get help from humans to use better starting
                | points for every subject. For example, for programming
                | questions it's gonna use the number of upvotes for a
                | given answer on stackoverflow; for a question about
                | astrophysics it's gonna prefer statements made by Neil
                | deGrasse Tyson over some random person online, and so
                | on. And to scale this approach it will slowly learn to
                | make associations from such curated information, e.g.
                | the people that Neil follows and RTs are more likely to
                | make truthful statements about astrophysics than random
                | people.
        
               | rapind wrote:
               | That makes complete sense, and yet the cynic (realist?)
               | in me is expecting a political nightmare. The stakes are
               | actually really high. AI will for all intents and
               | purposes be the arbiter of truth. For example there are
               | people who will challenge the truth of everything Neil
               | deGrasse Tyson says and will fight tooth and nail to
               | challenge and influence this truth.
               | 
               | We (western society) are already arguing about some very
               | obviously objective truths.
        
             | tomxor wrote:
             | Because I loathe captcha, I make sure that every time I am
             | presented one I sneak in an incorrect answer just to fuck
             | with the model I'm training for free. Garbage in, garbage
             | out.
        
               | froggit wrote:
               | I do this unintentionally on a regular basis.
        
               | A_non_e-moose wrote:
               | Glad to see a kindred soul out there. I thought I was the
               | only one :)
        
               | breppp wrote:
               | Generalizing over the same idea, I believe that whenever
               | you are asked for information about yourself you should
               | volunteer wrong information. Female instead of male,
               | single instead of married etc. Resistance through
               | differential privacy
        
               | Eupraxias wrote:
               | I've lived in 90210 since webforms started asking.
        
               | inlined wrote:
               | My email address is no@never.com. I've actually seen some
               | forms reject it though
        
               | hooverd wrote:
               | ASL?
        
               | codetrotter wrote:
               | 69/f/cali
        
           | richardw wrote:
           | People who aren't savvy and really want it to be right. Old
           | man who is so sure of its confidence that he'll put his life
           | savings on a horse race prediction. Mentally unstable lady
           | looking for a tech saviour or co-conspirator. Q-shirt wearers
            | with guns. Hey Black Mirror people, can we chat? Try to stay
           | ahead of reality on this one, it'll be hard.
        
           | Spooky23 wrote:
           | Exactly. I'd imagine this is a major reason why Google hasn't
           | gone to market with this already.
           | 
           | ChatGPT is amazing but shouldn't be available to the general
           | public. I'd expect a startup like OpenAI to be pumping this,
           | but Microsoft is irresponsible for putting this out in front
            | of the general public.
        
             | oldgradstudent wrote:
             | > ChatGPT is amazing but shouldn't be available to the
             | general public.
             | 
             | It's a parlor game, and a good one at that. That needs to
             | be made clear to the users, that's all.
        
               | Spooky23 wrote:
               | It's being added as a top line feature to a consumer
               | search engine, so expect a lame warning in grey text at
               | best.
        
             | flangola7 wrote:
             | I anticipate in the next couple of years that AI tech will
             | be subject to tight regulations similar to that of
             | explosive munitions and SOTA radar systems today, and
             | eventually even anti-proliferation policies like those for
             | uranium procurement and portable fission/fusion research.
        
               | srackey wrote:
                | ChatGPT/GPT3.5 and its weights can fit on a small thumb
                | drive, and be copied infinitely and shared. Tech will get
                | good enough in the next decade to make this accessible
                | to normies. The genie cannot be put back in the bottle.
        
               | Spooky23 wrote:
               | Sure it can. Missile guidance systems fit on a tiny
               | missile, but you can't just get one.
               | 
               | The controlled parlor game is there to seed acceptance.
               | Once someone is able to train a similar model with
               | something like the leaked State Department cables or
               | classified information we'll see the risk and the
               | legislation will follow.
        
               | airtonix wrote:
               | [dead]
        
               | thefaux wrote:
               | True. In the long run though, I expect we will either
               | build something dramatically better than these models or
               | lose interest in them. Throw in hardware advances coupled
               | with bitrot and I would go short on any of the gpt-3 code
               | being available in 2123 (except in something like the
               | arctic code vault, which would likely be effectively the
               | same as it being unavailable).
        
               | flangola7 wrote:
               | > ChatGPT/GPT3.5 and its weights can fit on a small thumb
               | drive, and copied infinitely and shared.
               | 
               | So can military and nuclear secrets. Anyone with uranium
               | can build a crude gun-type nuke, but the instructions for
               | making a reliable 3 megaton warhead the size of a
               | motorcycle have been successfully kept under wraps for
               | decades. We also make it very hard to obtain uranium in
               | the first place.
               | 
               | >Tech will get better enough in the next decade to make
               | this accessible to normies.
               | 
               | Not if future AI research is controlled the same way
               | nuclear weapon research is. You want to write AI code?
               | You'll need a TS/SCI clearance just to begin, the mere
                | act of writing AI software without a license is a
               | federal felony. Need HPC hardware? You'll need to be part
               | of a project authorized to use the tensor facilities at
               | Langley.
               | 
               | Nvidia A100 and better TPUs are already export restricted
               | under the dual-use provisions of munition controls, as of
               | late 2022.
        
         | c-fe wrote:
         | Out of interest, what did the source used as reference for the
         | 31-24 say exactly? Was it a prediction website and Bing thought
         | it was the actual result, or did the source not mention these
         | numbers at all.
        
           | googlryas wrote:
           | Giants beat the Vikings about a month ago with that score.
        
         | mitthrowaway2 wrote:
         | I treat GPT as I would a fiction writer. The factual content
         | correlates to reality only as closely as a fiction author would
          | go in an attempt to suspend disbelief. This answer is about as
         | convincing, apt, well-researched and factually accurate as I
         | would expect to find in a dialogue out of a paperback novel
         | published five years ago. I wouldn't expect it to be any better
         | or worse at answering who won the 2023 Quidditch Cup or the
         | 2023 Calvinball Grand Finals.
        
           | Aeolun wrote:
           | The only reasonable use case for ChatGPT now is if you
           | already know what the output should be (e.g. you are in a
           | position to judge correctness).
        
         | ska wrote:
         | >> "I would've expected it to not get such a basic query so
         | wrong."
         | 
          | Isn't this exactly what you would expect, with even a
          | superficial understanding of what "AI" actually is?
          | 
          | Or were you pointing out that the average person, using a
          | "search" engine that is actually at its core a transformer model,
          | doesn't a) understand that it isn't really a search and b)
          | have even a superficial understanding of what that means, and
          | therefore would be surprised by this?
        
         | spaniard89277 wrote:
          | I've tried perplexity.ai a bunch of times and I'd say I haven't
          | seen it get any query wrong, although it's true I always look
          | for technical info or translations, so my sample is not the same.
         | 
         | And the UI is better IMO.
        
           | leereeves wrote:
           | I just tried a similar query on perplexity.ai. "Who won the
           | Daytona 500 in 2023?" (the race is scheduled for February
           | 19th)
           | 
           | Result: _" Sterling Marlin won the 2023 Daytona 500, driving
           | the No. 4 for Morgan-McClure Motorsports[1]. He led a race-
           | high 105 laps and won his second career race at Daytona
           | International Speedway[1]. The 64th running of the DAYTONA
           | 500 was held on February 19, 2023[2]. Austin Cindric had
           | previously won the DAYTONA 500 in February 5, 2023[3]."_
        
             | deadmik3 wrote:
             | Wow, a driver that's been retired for 13 years won for a
             | team that shut down 10 years ago in the first ever season
             | that Nascar has decided to run 2 Daytona 500s in the same
             | month.
        
             | nickpeterson wrote:
             | It may be more profitable to ask what stocks gained the
             | most in value next week.
        
             | kahnclusions wrote:
             | Ahah! A time travelling AI!
        
             | echelon wrote:
             | Place your bets now. The AI might have clairvoyance and be
             | showing off.
        
             | 6510 wrote:
             | The temporal mechanics is fascinating.
        
           | hristov wrote:
            | I tried perplexity.ai and asked it in which stadium the
            | Chargers had their perfect season. It couldn't figure out
            | that the Chargers used to be the San Diego Chargers before
            | they moved to LA, and kept talking about their Los Angeles
            | stadium even though they have never had a perfect season
            | there.
        
           | egillie wrote:
           | I really like perplexity, but I've noticed that it sometimes
           | summarizes the paper incorrectly, as in it cites it as
           | concluding the opposite of what it actually concludes, so I
           | always click through to read the papers/studies. It's great
           | for surfacing relevant studies, though.
        
           | mrtranscendence wrote:
           | Maybe for your use cases. I've found perplexity.ai wrong a
           | few times just today:
           | 
           | * Misunderstanding one of its citations, it said that use of
           | `ParamSpec` in Python would always raise a warning in Python
           | 3.9
           | 
           | * When asked why some types of paper adhere to my skin if I
           | press my hand against them for a few minutes (particularly
           | glossy paper), it gave two completely different answers
           | depending on how the question was worded, one of which
           | doesn't necessarily make sense.
        
           | astrange wrote:
           | LLMs are incapable of telling the truth. There's almost no
           | way they could develop one that only responds correctly like
           | that. It'd have to be a fundamentally different technology.
        
             | wizofaus wrote:
             | The missing piece seems to be that for certain questions it
             | doesn't make sense to extrapolate, and that if it's a
             | question about what will happen in the future, it should
             | answer in a different manner (and from my own interactions
             | with ChatGPT it does exactly that, frequently referring to
             | the cut-off time of its training data).
        
             | mortehu wrote:
             | The model is capable of generating many different responses
             | to the same prompt. An ensemble of fact checking models can
             | be used to reject paths that contain "facts" that are not
             | present in the reference data (i.e. a fixed knowledge graph
             | plus the context).
             | 
             | My guess is that the fact checking is actually easier, and
             | the models can be smaller since they should not actually
             | store the facts.
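              | 
              | A minimal sketch of that "sample then filter" idea, with
              | a toy extract_facts() standing in for a real fact checker
              | (the candidates would come from sampling the LLM several
              | times):
              | 
              | ```python
              | import re
              | 
              | reference = ("Super Bowl LVII: Kansas City Chiefs 38, "
              |              "Philadelphia Eagles 35.")
              | candidates = [
              |     "Philadelphia Eagles won 31-24 at SoFi Stadium.",
              |     "Kansas City Chiefs won 38-35.",
              | ]
              | 
              | def extract_facts(text: str) -> set[str]:
              |     # Toy "facts": capitalised words and numbers.
              |     return set(re.findall(r"[A-Z][a-z]+|\d+", text))
              | 
              | ref_facts = extract_facts(reference)
              | accepted = [c for c in candidates
              |             if extract_facts(c) <= ref_facts]
              | print(accepted)  # only the second candidate survives
              | ```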
        
               | swatcoder wrote:
               | That's quite the system that can take in any natural
                | language statement and confirm whether it's true or false.
               | 
               | You might be underestimating the scope of some task here.
        
               | mortehu wrote:
               | Not true or false; just present or absent in the
               | reference data. Note that false negatives will not result
               | in erroneous output, so the model can safely err on the
               | side of caution.
               | 
               | Also 100% accuracy is probably not the real threshold for
               | being useful. There are many low hanging fruits today
               | that could be solved by absolutely tiny error correcting
               | models (e.g. arithmetic and rhyming).
        
               | astrange wrote:
               | There's research showing you can tell if something is a
               | hallucination or memorized fact based on the activation
               | patterns inside the LM.
        
               | CamperBob2 wrote:
               | Exactly. Given a source of truth, it can't be that hard
               | to train a separate analytic model to evaluate answers
               | from the existing synthetic model. (Neglecting for the
               | moment the whole Godel thing.)
               | 
               | The problem isn't going to be developing the model, it's
               | going to be how to arrive at an uncontroversial source of
               | ground truth for it to draw from.
               | 
               | Meanwhile, people are complaining that the talking dog
               | they got for Christmas is no good because the C++ code it
               | wrote for them has bugs. Give it time.
        
             | CommieBobDole wrote:
             | Yep, the idea of truth or falsity is not part of the
             | design, and if it was part of the design, it would be a
             | different and vastly (like, many orders of magnitude) more
             | complicated thing.
             | 
             | If, based on the training data, the most statistically
             | likely series of words for a given prompt is the correct
             | answer, it will give correct answers. Otherwise it will
             | give incorrect answers. What it can never do is know the
             | difference between the two.
        
               | astrange wrote:
               | > If, based on the training data, the most statistically
               | likely series of words for a given prompt is the correct
               | answer, it will give correct answers.
               | 
               | ChatGPT does not work this way. It wasn't trained to
               | produce "statistically likely" output; it was trained to
               | produce output that humans rate highly.
        
               | mrtranscendence wrote:
               | Not exactly. ChatGPT was absolutely trained to produce
               | statistically likely output, it just had an extra
               | training step added for human ratings. If they relied
               | entirely on human ratings there would not have been
               | sufficient data to train the model.
        
               | astrange wrote:
               | The last step is what matters. "Statistically likely" is
               | very underdetermined anyway; answering everything with
               | "e" is statistically likely.
               | 
               | (That's why original GPT3 is known for constantly ending
               | up in infinite loops.)
        
               | svachalek wrote:
               | "e" is not a likely response to anything. I think you are
               | not understanding the type of statistics involved here.
        
             | kenjackson wrote:
             | LLMs are not incapable of telling the truth. They tell the
             | truth all the time. They're incapable of knowing if what
             | they're telling is the truth.
             | 
             | That said, traditional search engines suffer from the same
             | problem, although with different symptoms. Search engines
             | point to web pages generated by people (or some other way)
             | which may or may not contain accurate information. They
             | select those pages based on things like strength of
             | authority, keyword similarity, etc. But truth never
             | directly enters the picture (only indirectly, by tweaking
             | the "trusted" set of pages).
        
               | DebtDeflation wrote:
               | >They're incapable of knowing if what they're telling is
               | the truth.
               | 
               | Right, because all they're doing is regurgitating the
               | data they were trained on. It's like the kid in your CS
               | class that copies someone else's coding assignment and
               | just changes all the function names, variable names, and
               | comments but not the actual executable stuff and never
               | even bothers to compile and run it to see if it produces
               | the desired output.
        
               | kenjackson wrote:
               | I feel like people don't appreciate how hard this is. But
               | maybe because I was raised on Zork, I really appreciate
               | this sort of stuff. Here's an example prompt I just gave to
               | ChatGPT:
               | 
               | "If Tim drops a coconut on the tail of a dog, what are
               | two things the dog might do?"
               | 
               | Response: "If Tim drops a coconut on the tail of a dog,
               | two possible things the dog might do are:
               | 
               | 1. Yelp or bark loudly in pain, as the impact of the
               | falling coconut could be quite painful.
               | 
               | 2. Try to escape or run away, as the dog may associate
               | the sudden impact with danger and feel the need to get
               | away from the source of the pain. Additionally, the dog
               | may feel scared or anxious, which can cause it to run
               | away."
               | 
               | Then I said, "But what if it was a soft coconut that was
               | as light as a feather?"
               | 
               | Response: "If Tim dropped a soft coconut that was as
               | light as a feather, it's possible that the dog wouldn't
               | feel much or any discomfort from the impact. In this
               | scenario, two things the dog might do are:
               | 
               | 1. Look around to see what just hit it and possibly sniff
               | at the coconut to investigate.
               | 
               | 2. Continue with its previous activity, ignoring the soft
               | impact as if it never happened. The dog may not perceive
               | the coconut as a threat or something worth reacting to,
               | especially if it was light enough to not cause any pain."
               | 
               | I just can't read these responses and think, "Ehh... just
               | a mindless regurgitation as expected from any LLM". These
               | simple prompt responses impress me and I kind of know the
               | technology -- although my experience in RNNs/LSTM is very
               | dated.
               | 
               | Honestly, I'd love to see Zork rewritten with ChatGPT as
               | a parser. No more trying to figure out how to write the
               | prompt for how to use the key in the door!! :-)
        
               | astrange wrote:
               | > Honestly, I'd love to see Zork rewritten with ChatGPT
               | > as a parser. No more trying to figure out how to write
               | > the prompt for how to use the key in the door!! :-)
               | 
               | That was done as AI Dungeon, but there was some
               | consternation due to the combo of charging for it and
               | GPT's predilection for generating wild and possibly
               | illegal sex scenes even when you don't ask it to.
        
               | astrange wrote:
               | > Right, because all they're doing is regurgitating the
               | data they were trained on.
               | 
               | That is not true, it's clearly able to generalize. (If it
               | can do anagrams, it's silly to say it's just
               | regurgitating the instructions for doing anagrams it read
               | about.)
               | 
               | But it doesn't try to verify that what it says might be
               | true before saying it.
        
               | wizofaus wrote:
               | It can't do anagrams though (every now and then it might
               | get a common one right, but in general it's bad at letter-
               | based manipulations/information, including even word
               | lengths, reversals, etc.).
        
               | astrange wrote:
               | It doesn't know what letters are because it sees BPE
               | tokens, but if you forgive that it does something like
               | it.
               | 
               | example prompt: Imagine I took all the letters in
               | "Wikipedia" and threw them in the air so they fell on the
               | ground randomly. What are some possible arrangements of
               | them?
        
               | wizofaus wrote:
               | But regular search engines only regurgitate what they've
               | indexed, yet don't invent outright nonsense when they
               | don't know (if you ask Google who won the Super Bowl in
               | 2024, the nature of the results makes it clear it simply
               | doesn't have that information. Though if you change it to
               | "world cup" one of the top answers says "Portugal was the
               | defending champion, defeating Argentina". The result is
               | titled "2024 futsal world cup"!)
        
               | wpietri wrote:
               | > LLMs are not incapable of telling the truth. They tell
               | the truth all the time. They're incapable of knowing if
               | what they're telling is the truth.
               | 
               | We're well into philosophy at this point, but I'd argue
               | that telling the truth requires knowledge of the truth.
               | Individual statements could be true, false, or neither.
               | But I'd say neither an LLM nor a monkeys-with-typewriters
               | generator could be said to tell the truth, because what
               | they're doing is both orthogonal to truth and lacking in
               | agency. If we stretch to thinking of it as having agency,
               | then I think the correct term would be "bullshitting",
               | aka producing speech "without regard for truth". [1]
               | 
               | [1] https://en.wikipedia.org/wiki/On_Bullshit
        
               | lr4444lr wrote:
               | Traditional search engines aren't putting their
               | imprimatur onto information by concealing its origin.
        
               | kenjackson wrote:
               | I don't think it is concealing the origin, but likely
               | doesn't actually know the origin. That said, I agree that
               | if they can provide sources (even probabilistically),
               | that would be a good step forward.
        
           | bmelton wrote:
           | I just checked it with a random query while watching Buffy
           | the Vampire Slayer reruns, and it wasn't great. (Spoiler
           | alerts)
           | 
           | "How did Buffy the Vampire Slayer die?"
           | 
           | "Buffy the Vampire Slayer died in the Season One finale when
           | she threw herself into a portal to save her sister Dawn[1].
           | This was the first time Buffy had died, occurring on June 2,
           | 1997[2]. She died again in the Season Five finale while
           | fighting off Glory's demons[3]. Other characters such as Anya
           | and Kevin Benedict also died during the series[4][5]."
           | 
           | Okay, so it was a trick question, because Buffy definitely
           | died more than once, but it's conflated the fact that she
           | died in Season 1 with the cause of her Season 5 death. Also,
           | I had to Google Kevin Benedict to remember that he was
           | Cordelia's sometimes boyfriend, and an extremely minor
           | character, which makes me question how that death is more
           | notable than Buffy's mom, or Tara, or Jenny Calendar, etc.
           | 
           | I like that this seems to have been more lexical confusion
           | than how ChatGPT seems to enjoy filling empty spaces with
           | abject lies, but perhaps it's worth exploring what you're
           | asking it that has left it with such a great batting average?
        
           | jacooper wrote:
           | I have seen it multiple times answering correctly at first,
           | then adding something which has nothing to do with the
           | original question.
           | 
           | That's almost always sourced from a website that didn't
           | actually answer the question I had, so maybe it's more of a
           | query optimization issue.
        
         | nimithryn wrote:
         | It has the Super Bowl numbers wrong, too. The last Super Bowl
         | is LVI, which was Rams vs Bengals... the Super Bowl before that
         | one was Tampa Bay Buccaneers vs Kansas City. it has every fact
         | wrong but in the most confusing way possible...
        
           | kranke155 wrote:
           | But the HN chatter was convinced that GPT would dethrone
           | Google! Google has no chance!!
           | 
           | Another silly tech prediction brought to you by the HN
           | hivemind.
        
             | wizofaus wrote:
             | A little premature to be calling such a prediction "silly".
             | I think it's safe to assume some sort of LLM-based tech
             | will be part of the most successful search engines within a
             | relatively short period of time (a year tops). And if
             | Google dallies its market share will definitely suffer.
        
         | __MatrixMan__ wrote:
         | I think we need to work on what constitutes a citation. Your
         | browser should know whether:
         | 
         | - you explicitly trust the author of the cited source
         | 
         | - a chain of transitive trust exists from you to that author
         | 
         | - no such path exists
         | 
         | ...and render the citation accordingly (e.g. in different
         | colors)
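         | 
         | A rough sketch of that lookup, assuming the browser kept a
         | simple "who trusts whom" graph (the data structure and
         | names here are invented purely for illustration):
         | 
         |     from collections import deque
         | 
         |     def classify(me, author, trusts):
         |         # trusts: dict mapping each person to the set
         |         # of authors they explicitly trust
         |         if author in trusts.get(me, set()):
         |             return "direct"        # e.g. render green
         |         # breadth-first search for a transitive path
         |         seen, queue = {me}, deque([me])
         |         while queue:
         |             node = queue.popleft()
         |             for nxt in trusts.get(node, set()):
         |                 if nxt == author:
         |                     return "transitive"   # e.g. amber
         |                 if nxt not in seen:
         |                     seen.add(nxt)
         |                     queue.append(nxt)
         |         return "untrusted"        # e.g. render red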
        
           | mcguire wrote:
           | And that the cited document actually exists and says what
           | it's claimed to say.
        
             | __MatrixMan__ wrote:
             | Agreed.
             | 
             | Existence is easy, just filter untrusted citations.
             | Presumably authors you trust won't let AIs use their keys
             | to sign nonsense.
             | 
             | Claim portability is harder but I think we'd get a lot out
             | of a system where the citation connects the sentence (or
             | datum) in the referenced article to the point where it's
             | relevant in the referring article, so it's easier for a
             | human to check relevance.
        
         | scarface74 wrote:
         | And this doesn't seem like it's a hard problem to solve:
         | 
         | 1. Recognize that the user is asking about sports scores. This
         | is something that your average dumb assistant can do.
         | 
         | 2. Create an "intent" with a well-defined, formatted structure.
         | If ChatGPT can take my requirements and spit out working Python
         | code, how hard could this be?
         | 
         | 3. Delegate the request to another module that can call an
         | existing API, just like Siri, Alexa, or Google Assistant do.
         | 
         | Btw, when I asked Siri, "who won the Super Bowl in 2024", it
         | replied that "there are no Super Bowls in 2024" and quoted the
         | score from last night and said who won "in 2023".
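         | 
         | A toy version of the three steps above (the intent schema
         | and the sports_api call are invented for illustration;
         | none of this reflects how Bing actually routes queries):
         | 
         |     import re
         | 
         |     def parse_intent(query):
         |         # steps 1-2: recognize a sports-score question
         |         # and turn it into a structured intent
         |         m = re.search(r"super bowl in (\d{4})",
         |                       query.lower())
         |         if m:
         |             return {"intent": "sports_score",
         |                     "event": "super_bowl",
         |                     "year": int(m.group(1))}
         |         return {"intent": "general", "query": query}
         | 
         |     def handle(intent, sports_api):
         |         # step 3: delegate to a conventional API rather
         |         # than letting the language model guess
         |         if intent["intent"] == "sports_score":
         |             return sports_api(intent["event"],
         |                               intent["year"])
         |         return None  # fall back to normal search / LLM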
        
       | wharfjumper wrote:
       | Is there an AI blockchain yet?
        
       | megaman821 wrote:
       | Maybe it is fine in beta, but in post-beta they should not use AI
       | for every search query. The key is going to be figuring out when
       | the AI is adding value, especially since even running the AI for
       | a query is 10x more expensive than a normal search. It may be
       | hard to figure out where to apply AI though. If a user asks
       | "whats the weather?", no need for AI. If a user asks "I am going
       | to wear a sweater and some pants, is that appropriate for today's
       | weather?", now you might need AI.
        
       | gardenhedge wrote:
       | Microsoft just absolutely sucks at things.
       | 
       | I was using Bing Maps earlier and it had shops in the wrong
       | location. Like it would give you directions to the wrong
       | location. The correct one would be another 30-40 minute walk from
       | the destination it said.
       | 
       | It also showed a cafe near me which caught my interest. I zoomed
       | in further and thought "I've never seen that there". Clicking on
       | it brought me to a different location in the map... a place in
       | Italy!
        
       | Sparkyte wrote:
       | Can any AI be trusted outside of its realm of data? I mean it is
       | only a product of the data it takes in. Plus it isn't really
       | _finger quotes_ AI. It's just a large data library with some neat
       | query language where it tries to assemble the best information
       | not by choice but by probability.
       | 
       | Real AI makes choices not on probability but in accordance with
       | self-preservation, emotions, and experience. It would also have
       | the ability to re-evaluate information and the above.
        
       | lopkeny12ko wrote:
       | "Traditional" Google searches can give you wildly inaccurate
       | information too. It's up to the user to vet the sources and think
       | critically to distinguish what's accurate or not. Bing's new
       | chatbot is no different.
       | 
       | I hope this small but very vocal group of people does not
       | compromise progress of AI development. It feels much like the
       | traditional media lobbyists when the Internet and world wide web
       | were first taking off.
        
         | capitalsigma wrote:
         | These models are very impressive, but the issue (imo) is that
         | lay people without an ML background see how plausibly-human the
         | output is and infer that there must be some plausibly-human
         | intelligence behind it that has some plausibly-human learning
         | mechanism -- if your new hire at work made the kinds of
         | mistakes that ChatGPT does, you'd expect them to be up to speed
         | in a couple of weeks. The issue is that ChatGPT really isn't
         | human-like, and removing inaccurate output isn't just a
         | question of correcting it a few times -- its learning process
         | is truly different, and it doesn't understand things the way
         | we do.
        
         | itamarst wrote:
         | These AI systems are like a spell checker that hallucinates new
         | words: did you mean to type "gnorkler"?
         | 
         | At least Google (when not using the summarization "feature")
         | doesn't invent new stuff on its own.
        
         | methodical wrote:
         | Traditional Google searches are a take-it-or-leave-it
         | situation. The result depends on your interpretation of the
         | sources Google provides, and therefore you are expecting the
         | possibility of a source being misleading or inaccurate.
         | 
         | On the other hand, I don't expect to be told an inaccurate &
         | misleading answer from somebody whom I was told to ask the
         | question -- and who doesn't provide sources.
         | 
         | To conflate the expectations of traditional search results with
         | the output of a supposedly helpful chat bot is wildly
         | inappropriate.
        
         | whimsicalism wrote:
         | They've built a much larger anti-tech coalition in the
         | subsequent years.
        
       | weberer wrote:
       | There's also the instance of the Bing chatbot insisting that the
       | current year is 2022 and being EXTREMELY passive-aggressive when
       | corrected.
       | 
       | https://libreddit.strongthany.cc/r/bing/comments/110eagl/the...
        
         | darknavi wrote:
         | > I'm sorry, but you can't help me believe you.
        
           | basseed wrote:
           | lol not a chance that's real
        
             | weberer wrote:
             | Look up "Tay AI" if you missed it the first time around.
        
       | aliqot wrote:
       | I wonder how much the upspeak way of typing affects this. People
       | (even the author) often end declarations with question marks.
       | Does this have any influence on the way the LLM parses the
       | prompt?
        
       | neilv wrote:
       | What would be nice is for Microsoft to get hit by a barrage of
       | lawsuits, MS to be ridiculed in the press and punished on Wall
       | Street, and vindication of Google's more responsible introduction
       | of AI methods over the years.
       | 
       | There will still be startups doing reckless things, but large,
       | established companies that can immediately have bigger impact
       | also have a lot more to lose.
        
       | csours wrote:
       | AI is dreaming and hallucinating electric sheep
        
       | EGreg wrote:
       | ChatGPT, can we trust it?
       | 
       | https://m.youtube.com/watch?v=_nl0bwDNVPw
        
       | rpastuszak wrote:
       | How do we educate "non-technical" people about the issues with
       | LLMs hallucinating responses? I feel like there's a big incentive
       | for investors and businesses to keep people misinformed (not
       | unlike with ads, privacy or crypto).
       | 
       | Have you found a good, succinct and not too technical way of
       | explaining this to, say, your non-techie family members?
        
       | jiggyjace wrote:
       | Ehhh I found this article to be quite inauthentic about the
       | performance of Bing AI compared to how I have used it. The
       | article didn't even share its prompts, except for the last one
       | about Avatar and today's date (which I couldn't replicate myself;
       | I kept getting correct information). I'm not trying to prove that
       | Bing AI is always correct, but compare it to traditional search,
       | Siri, or Alexa and it's like comparing a home run hitter that
       | sometimes hits foul balls to a 3 year old that barely knows how
       | to pick up the baseball bat.
        
         | pphysch wrote:
         | The article is primarily demonstrating significant errors in an
         | official Bing demo, not some contrived queries.
        
         | jnsaff2 wrote:
         | The main article is based on the Microsoft demo. So the prompts
         | are by them and not some clickbait hacking.
        
       | sixtram wrote:
       | I've posted this into another thread as well, from Sam Altman,
       | CEO of OpenAI, two months ago, on his Twitter feed:
       | 
       | "ChatGPT is incredibly limited, but good enough at some things to
       | create a misleading impression of greatness. it's a mistake to be
       | relying on it for anything important right now. [...] fun
       | creative inspiration; great! reliance for factual queries; not
       | such a good idea." (Sam Altman)
        
         | joe_the_user wrote:
         | ChatGPT as a system involves an unreliable LLM chatbot and a
         | series of corrections efficient enough to give the impression
         | of reliability for many fields, and together these _feel like
         | the future_ -- enough to get a "code red" from Google.
         | 
         | It's worth remembering that back in the day, Google succeeded
         | not by exact indexing but by having the highest-quality
         | results for each term -- and they used existing resources as
         | well as human workers to get these (along with PageRank).
         | 
         | What you have is a hybrid system, one whose filter is
         | continuously updated. But it's a very complicated machine,
         | and going from something that seemingly works to something
         | that satisfies the multitude of purposes modern search serves
         | is going to be a huge and hugely expensive project.
         | 
         | https://gizmodo.com/openai-chatgpt-ai-chat-bot-1850001021
        
         | [deleted]
        
         | dpflan wrote:
         | It feels deeply ironic and cynical that MSFT touts putting
         | ChatGPT everywhere, essentially across the business-document
         | platform. Are users going to ask about company facts, get
         | hallucinations, and put those hallucinations into business
         | documents that then compound ChatGPT's tendency to
         | hallucinate?
        
         | burkaman wrote:
         | But in interviews about the Bing partnership, Sam has been
         | saying that while ChatGPT was a bad tech demo, Bing Chat is
         | using a better model with way better features that everyone
         | should be using. He's been talking about how great it is that
         | it cites its references, integrates the latest data, etc. I'm
         | specifically thinking of the New York Times' Hard Fork podcast
         | he was on (https://www.nytimes.com/2023/02/10/podcasts/bings-
         | revenge-an...), but I suspect he's been saying the same things
         | to everyone. He's been marketing Bing Chat as a significant
         | improvement ready for mass usage, when it really seems like
         | it's basically just ChatGPT with search results auto-included
         | in the prompt.
        
           | hackernewds wrote:
           | wonder what he has to say about the humanlike responses here
           | 
           | https://www.reddit.com/r/bing/comments/110eagl/the_customer_.
           | ..
           | 
           | I would rather an AI chat _not_ act human
        
             | danparsonson wrote:
             | Wow.... we've created artificial cognitive dissonance!
        
           | brap wrote:
           | Up until ChatGPT became all the rage, Sam has been pushing a
           | crypto scam called Worldcoin, which aims to scan everyone's
           | eyeballs(??) and somehow pay everyone in the world a living
           | wage(???) without creating any value. This while allegedly
           | exploiting people from 3rd world countries.
           | 
           | https://www.technologyreview.com/2022/04/06/1048981/worldcoi.
           | ..
           | 
           | So, as much as I am impressed by the tech of ChatGPT, I don't
           | consider him to be a very credible person.
        
       | heywherelogingo wrote:
       | No AI can be trusted - the A stands for Artificial.
        
       | malshe wrote:
       | Someone posted on Twitter that ChatGPT is like economists --
       | occasionally right, but super confident that they are always
       | right.
        
         | [deleted]
        
       | Waterluvian wrote:
       | I absolutely love these new tools. But I'm also convinced that
       | we're going through an era of trying to mis-apply them. "These
       | new tools are so shiny! Quick! Find a way to _MONETIZE_!!!! "
       | 
       | I hope we don't throw the baby out with the bathwater when all is
       | said and done. These AIs are incredibly powerful given the
       | correct use cases.
        
       | moomoo11 wrote:
       | Since GPT always needs to be "up-to-date", and search usually
       | requires near real-time accuracy, there needs to be some sort of
       | reconciliation on queries so that if the query seems to be asking
       | for something real time, it will leverage search results to ad-
       | hoc improve the response.
       | 
       | Or.. it should let us know the "last index date" so we the users
       | can make a determination if we want to ask a knowledge based
       | question or a more real-time question.
        
         | matthews2 wrote:
         | Bing AI "solves" this by shoving search results into the
         | prompt.
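         | 
         | Roughly something like this (a guess at the general shape
         | of it, not Bing's actual prompt or retrieval code; the
         | search() helper is assumed to return dicts with 'date'
         | and 'snippet' keys):
         | 
         |     def build_prompt(question, search, k=3):
         |         # retrieval-augmented prompting: run an ordinary
         |         # web search and paste the top results, with
         |         # dates, above the user's question
         |         hits = search(question)[:k]
         |         context = "\n".join(
         |             f"[{i+1}] ({h['date']}) {h['snippet']}"
         |             for i, h in enumerate(hits))
         |         return (f"Web results:\n{context}\n\n"
         |                 "Answer using only the results above,"
         |                 " citing them by number.\n"
         |                 f"Question: {question}")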
        
       | chatterhead wrote:
       | [dead]
        
       | shanebellone wrote:
       | *AI Can't Be Trusted
        
       | jamesfisher wrote:
       | This would be a good post, if only I could read any of those
       | images on mobile. Substack, fix your damned user-scalable=0! Even
       | clicking on the image doesn't provide any way of zooming in on
       | it. Do they do any usability testing?
        
       | EchoReflection wrote:
       | Srsly? Micro$oft can't be trusted? Next someone will say that
       | water is wet!
        
       | impoppy wrote:
       | It is not Bing that cannot be trusted, but LLMs in general. They
       | are so good at imitating, I don't think any human being will ever
       | be able to imitate stuff as well as those AIs do, but they
       | understand nothing. They lack the concept of the information
       | itself; they are only good at presenting it.
        
       | HankB99 wrote:
       | > I am shocked that the Bing team created this pre-recorded demo
       | filled with inaccurate information, and confidently presented it
       | to the world as if it were good.
       | 
       | Perhaps MS had their AI produce the demo. Isn't one of the issues
       | with this sort of thing how "confidently" the process produces
       | wrong information?
        
       ___________________________________________________________________
       (page generated 2023-02-13 23:01 UTC)