[HN Gopher] Generative AI could make search harder to trust
___________________________________________________________________
Generative AI could make search harder to trust
Author : jedwhite
Score : 118 points
Date : 2023-10-05 17:13 UTC (5 hours ago)
(HTM) web link (www.wired.com)
(TXT) w3m dump (www.wired.com)
| anjel wrote:
| More than Pinterest?
| throwawaaarrgh wrote:
| Why are people calling them hallucinations and not just errors,
| flaws or bugs? You can't hallucinate if all of your perception is
| one internal state. Chatbots don't dream of electric sheep.
| crispycas12 wrote:
| Personally, I think confabulations would be a better term. To
| the best of my understanding, these AI rely on a model similar
| to the reconstructive theory of memory in humans. The
| connotation of the word confabulation indicates no
| maliciousness while highlighting the erroneous nature of the
| action.
| IronWolve wrote:
| it's almost like AI just repeats data it's fed on, even incorrect
| data, without any real intelligence to determine if the data is
| correct.... /s
|
| It's not simply garbage in garbage out. There is no logic to
| verify and analyze the data. You are simply told what is popular
| in the data.
| lazide wrote:
| Unfortunately, that is also a sizable portion of the human
| population. AI definitely does it cheaper and at larger scale
| though!
| yetanotherloser wrote:
| I've definitely met a lot of people who fail the GPT test.
| aconsult1 wrote:
| All of a sudden the saying "eat your own dog food" takes a
| twist and is no longer fun.
| smt88 wrote:
| AI doesn't "just" repeat data. You can feed a LLM 100% fact-
| checked data and it'll still hallucinate.
|
| It's a core problem with generative AI and it can't be solved
| with better data.
| zpeti wrote:
| Here's what people don't understand: this is mostly good for
| google.
|
| The worse organic results are, the more people will click on paid
| links. This is WHY everyone on HN is complaining about search
| results, because google doesn't really have an incentive to give
| you really good results. They only need to be good enough to keep
| 95% of the population still using google, but mostly expecting
| the good results to be ads.
|
| Google ads are the equivalent of verification on FB and X. They
| just call it something different. The verified, high quality
| results will be paid.
| tivert wrote:
| We did it guys! We're definitely heading into a new era, one
| perfected by software engineers. I can't wait!
| jowea wrote:
| AI powered citogenesis!
|
| I'm starting to wish articles had inline citations as a standard.
| dredmorbius wrote:
| Inline as opposed to hyperlinks?
|
| Or would footnotes / sidenotes be acceptable?
| faizshah wrote:
| I started to go down a line of thinking where I think we might
| see a return to books in the next 3-5 years. The reason is that
| with a book it's a big collection of knowledge and people can
| post reviews about the quality of the book whereas on the web you
| have no way of knowing what the quality of an article will be
| anymore.
| klyrs wrote:
| Only, amazon is now flooded with crapbooks written by
| artificial psychonauts and also reviews written by artificial
| psychonauts.
| notamy wrote:
| https://archive.ph/2023.10.05-165142/https://www.wired.com/s...
| infoseek12 wrote:
| Leaving aside the article to discuss the source for a moment.
| When did Wired become so antitech?
|
| There are good critical viewpoints but most of the articles they
| are putting out at this point read like bitter diatribes. Which
| is a shame because they used to be an excellent publication.
| cr__ wrote:
| People are generally more cognizant of the harms caused by the
| tech industry than they were even a few years ago.
| LikelyABurner wrote:
| You can find a plethora of critical viewpoints on Hacker News
| and the various blogs it links to which are well cognizant of
| the dangers of the tech industry.
|
| The problem isn't that Wired is critical, it's that they've
| gone weirdly reactionary and their writing has gone so mass
| market dumbed down that Some Random Guy's Blog is likely to
| have a better written and researched viewpoint.
| lazide wrote:
| They probably laid off almost everyone but some burnt out
| interns.
| robinsonb5 wrote:
| Plot twist: maybe the article was written by ChatGPT!
| lazide wrote:
| Better than 50/50 odds I'm guessing
| thejazzman wrote:
| This.
|
| The academic internet of the 90s is so far gone and while
| we're seeing a lot of magic lately, it's magic available to
| literally everybody for any and every purpose.
|
| We're rapidly seeing how boring and disappointing that is :(
| illwrks wrote:
| Putting journalists out of work I guess?
| salynchnew wrote:
| Recently an article came out where someone said that the company
| I work for is a big user of WebAssembly, but the reality is that
| we don't use it.
|
| After finding the contributed article (on a well-known news site,
| not Wired though), it looks like a tech founder might've been
| using ChatGPT to write an article about the uses for WASM. The
| arguments were generally sound, but I don't think that anyone did
| the work to manually check any of the facts they presented in it.
| notabee wrote:
| This is kind of like the advent of spellcheck, where a whole
| class of errors started to appear regularly in almost every
| article because publishers stopped paying for the human labor
| to manually review for things like homonym or word ordering
| errors. Except much worse, because it could allow spurious or
| even harmful facts to accrue and spread instead of just
| grammatical mistakes.
| lykahb wrote:
| The SEO garbage has been poisoning the search for years. Even
| before the chatbots it got to the point when most top results are
| crap. The LLMs can surely make it much worse, though.
| hashtag-til wrote:
| I think this is a given these days. LLMs likely will become the
| new single point of failure for search.
|
| This is too much of a temptation for the SEO scum to resist.
| abujazar wrote:
| <<Could>>? Google has already been doing this for quite some
| time, at least in my region (Norway), and I'd say more than half
| of the suggestions Google provides as top results are false.
| nonrandomstring wrote:
| More amusing and frightening is when people search about
| themselves and turn up AI generated crap. Googling yourself was
| always a lucky grab bag, with the possibility of long-forgotten
| embarrassments being dragged up. But at least you'd have to face
| facts.
|
| Now I hear of people discovering they're in prison, married to
| random people they've never met, or are actually already dead.
|
| What is this going to do to recon on individuals (for example by
| employers, border agents or potential romantic partners) when
| there's a good chance the reputation raffle will report you as a
| serial rapist, kiddy-fiddler or Tory politician?
| vorpalhex wrote:
| This is a new way to be anonymous too. Someone post something
| true but nasty about you? Have LLMs cook up dozens of
| preposterous stories - you're secretly a rodeo clown, you write
| children's books, you built a castle in Rome, you once drank a
| goldfish, etc.
|
| Increase noise to drown signal.
| kr0bat wrote:
| This is essentially the service Reputation.com claims to
| provide. Jon Ronson's "So You've Been Publicly Shamed"
| describes how the site games SEO to flood the search results of
| controversial figures with banal nothing posts[1]. The
| difference being that actual humans had to create that
| content.
|
| In the near future, the web could become opaque with LLM
| schlock, but at least it may grant people a right to be
| forgotten.
|
| [1]https://www.businessinsider.com/lindsey-stone--so-youve-
| been...
| acomjean wrote:
| I think Boris Johnson tried that by saying out of the blue:
| he makes model busses. There was some thinking at the time
| that he didn't want the brexit bus to show up in searches and
| was trying to game search results.
|
| I don't think it worked.
| JohnFen wrote:
| > Now I hear of people discovering they're in prison, married
| to random people they've never met, or are actually already
| dead.
|
| My real name is very, very common -- so this has been my
| reality for my entire life.
|
| These days, I have grown to appreciate it. It's like an
| invisibility superpower.
| p0w3n3d wrote:
| And entropy rises... people thought AI will kill us with machine
| guns. AI will kill us by making us super stupid...
| euroderf wrote:
| I have already externalized my to-do lists and other reminder
| lists to teh interwebz. I can't wait to outsource my faculties
| for reasoning too.
| ChatGTP wrote:
| And it's only $20 a month and it's useful !! I'm using it
| eVerYdAy to save hOuRS!!!
| 23B1 wrote:
| That really sucks for all the people whose job it is to make
| search impossible to trust already /s
| pseudosavant wrote:
| I wonder if there will be a human information/knowledge
| equivalent of low-background steel (pre-WWII/nukes). Data from
| before a certain point won't be 'contaminated' with LLM stuff,
| but it'll be everywhere after that.
|
| https://en.wikipedia.org/wiki/Low-background_steel
| thih9 wrote:
| I wonder how we'd test for AI contamination. And would there be
| attempts to sell a larger data set, one that pretends to be
| human generated, but instead is padded with some AI content.
|
| Does this mean we'd end up with a finite set of verified human
| only data?
|
| Would people start going through all kinds of offline archives
| via AI-gapped means, trying to uncover and document new sources
| of human input?
| ryanklee wrote:
| People are vastly overestimating how unique this problem of
| hallucinations is.
|
| It seems to me it relies mostly on discounting just how much
| we've already had to deal with this same problem in humans over
| the millennia.
|
| The problem of proliferation of bad information might be
| getting worse, but this isn't native to generative AI. The
| entire informational ecosystem has to deal with this. GPTs
| compound the issue, but as far as I can tell, nowhere near
| what social media has forced us to deal with.
| cscurmudgeon wrote:
| How do we know you are not hallucinating this comment?
| blibble wrote:
| humans can only produce semi-convincing bullshit at a limited
| rate
|
| with AI this limit is all but removed
|
| all the human generated bullshit ever created will soon be
| dwarfed by what AI can vomit out in an hour
| HappyDaoDude wrote:
| Like most things in the world. The problem isn't
| necessarily the technology but the scale at which it is
| implemented.
| wellthisisgreat wrote:
| Yeah if you think about it, there is no history for example,
| as all we have in that domain is just someone's perspective
| on some events. They may or may not have agenda but that's
| beside the point.
|
| That soft data could never have been trusted; the information
| that can be verified (calculations etc.) seems safe from LLMs
| BobaFloutist wrote:
| The thing is when you call a human on bullshit, they usually
| can't back it up well enough to pass the smell test. When you
| call an AI on bullshit it can instantly fabricate plausible,
| credible seeming sources/evidence.
|
| A human's lie is different than an AI's hallucination, since
| it's still based on (distorting) the truth, whereas the
| hallucination is based on an invented reality (yes I know
| it's applied statistics and there's no true model of the
| world in there, but it can report as if there is)
| ryanklee wrote:
| Intelligent people can fill the void of ignorance with
| plausible sounding but factually incorrect information.
| They are apt to engage cognitive biases in such a way that
| the biases produce assertions that are deeply
| indistinguishable from factual assertions. They fool
| themselves in this way and they fool others. This happens
| all the time.
|
| LLMs are no different in this respect.
| gyudin wrote:
| It's not a big deal, there are many ways to handle it. It
| just has some overhead costs. LLMs that are offered to
| general public are more of a POC and they are making sure
| to use as little resources as possible.
| Agree2468 wrote:
| Right now is best time to buy encyclopedias.
| dotnet00 wrote:
| In some ways it already is that way. If I come across an artist
| I suspect is passing off AI generated stuff as their own
| (without using the tagging features the site has to indicate as
| much), an easy test is to just check if they've been posting
| since before ~2020. If they have, and the style has
| recognizable similarities, it's clear that it's honestly human
| made or at most blends characteristics of both together.
| BitwiseFool wrote:
| Those simple web 1.0 sites made by college professors are a
| gold-standard in my book. I always enjoy finding them in search
| results. Although they are becoming increasingly rare.
| dredmorbius wrote:
| Unfortunately, that's a trivial signal to emulate.
|
| At a minimum, you'd have to validate them by confirming
| existence in the Wayback Machine.
|
| Otherwise agreed that those are indeed high-signal documents.
| Increasing reliance on integrated educational software means
| that even such things as online syllabi are increasingly
| rare.
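The Wayback Machine check suggested above could be sketched roughly as follows. This is a minimal illustration, not a vetted tool: it assumes the Internet Archive's availability endpoint at https://archive.org/wayback/available and its "archived_snapshots" / "closest" JSON layout, and it assumes that asking for a very early timestamp returns the oldest capture. The function names, the 1996 probe date and the 2020 cutoff are all hypothetical choices made for the example.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the Wayback Machine availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp="19960101"):
    """Build a query asking for the capture closest to an early date."""
    query = urllib.parse.urlencode({"url": page_url, "timestamp": timestamp})
    return f"{AVAILABILITY_ENDPOINT}?{query}"


def closest_snapshot_timestamp(payload):
    """Return the closest capture's timestamp (YYYYMMDDhhmmss) or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest.get("timestamp")


def archived_before(payload, cutoff="20200101000000"):
    """True if the reported capture predates the (hypothetical) cutoff.

    Timestamps are fixed-width digit strings, so plain string
    comparison orders them chronologically.
    """
    ts = closest_snapshot_timestamp(payload)
    return ts is not None and ts < cutoff


def fetch_availability(page_url, timestamp="19960101"):
    """Network call; kept separate so the parsing can be tested offline."""
    url = availability_url(page_url, timestamp)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

A capture date only shows that something was at that URL at the time; comparing the archived content against the current page would still be a manual step.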
| LordDragonfang wrote:
| The type of sites GP is talking about are typically hosted
| on .edu servers, under faculty webhosting (often featuring
| a "/~profname/" in the url). That's a non-trivial signal.
| dredmorbius wrote:
| ~/name at an edu is pretty attainable.
|
| .edu domains can be had for any otherwise eligible
| "U.S.-based postsecondary institutions" per Educause:
| <https://net.educause.edu/eligibility.htm>
|
| Pages at _extant_ domains might variously be available to
| undergraduate or graduate students, faculty, staff, and
| adjuncts. Those might either directly host emulative
| material or be convinced or compromised into hosting
| content.
|
| If there's one thing that the Internet's history to date
| has proved, it's that perverse incentives lead to perverse
| consequences.
| l33t7332273 wrote:
| It is not easy for a regular person to obtain access to a
| .edu webpage.
| [deleted]
| heavyset_go wrote:
| Can't prove it, but it seems to me like black text on white
| background sites from the past are poorly ranked compared to
| sites with "modern" layouts.
| hashtag-til wrote:
| Yes. I love black text on white background. A rare find
| these days.
|
| Browsing today is like: "You ask for a spaghetti recipe and
| the page tell you the whole history of civilization."
| zeroonetwothree wrote:
| That's specific to recipes because they can't be
| copyrighted
| hashtag-til wrote:
| I had a look and definitely learned something today so
| #til.
|
| Also, note to self to collect my favourite recipes in
| markdown files from now on.
| MrVandemar wrote:
| search.marginalia.nu is a great place to find those sites,
| and some more interesting stuff besides.
| DayDollar wrote:
| There will be a web of trust, with a valuation of nodes by
| trustworthiness. And people will get only one id for this. One's
| name is one's value and a reputation will be a hard earned thing
| again.
| ratg13 wrote:
| This was how the "internet" functions in the book "Ender's
| Game".
|
| There is a small sub-plot about how he had to give a fake
| persona credibility on the untrusted network in order to be
| able to leverage creating a fake account on the trusted
| network.
| dredmorbius wrote:
| I find the xkcd interpretation more realistic:
| <https://xkcd.com/635/>
|
| Explained: <https://www.explainxkcd.com/wiki/index.php/635:
| _Locke_and_De...>
| notahacker wrote:
| I love that interpretation, but in today's retweet driven
| world of political commentary, I actually find it quite
| plausible that pseudonymous kids with no grasp of the
| real world who _think_ rational political debate is the
| nonsensical slogans they're spouting on the internet
| become major Twitter influencers that actual politicians
| want to court for their "authenticity" and "willingness
| to say the unsayable", and maybe their dank memes.
| dredmorbius wrote:
| The conceit of _Ender's Game_ was that _thoughtful_
| discourse would be influential online.
|
| Reality has largely demonstrated that far more thoughtless
| material wins out: propaganda of the Big Lie and the Firehose
| of Bullshit (or Falsehood), associated with Russia; floods of
| irrelevance which tend to bury more significant stories,
| favoured by China; and outrage / hot-button topics, which are
| common in US-centric media, though a timeless technique.
|
| Memes and simple messages attract attention and spread.
| Complex narratives and analyses ... not so much.
|
| But yes, voices that deserve no attention whatsoever have
| dominated the media landscape of the past decade or so.
| Not that this is _entirely_ novel.
| carlosjobim wrote:
| Isn't this how it has been since the dawn of time?
| RandomWorker wrote:
| My sense is that to avoid this, you should have a personal blog.
|
| That being said, how many people write blogs with Grammarly or
| ChatGPT these days? The temptation to use these technologies
| all the time is too strong for even self preservation of your
| own (writers) voice.
|
| My sense is that if you use this technology you might be happy
| with the results at first but on later review you just notice
| something off in some sentences and maybe it just doesn't flow
| right. I'm not convinced that it will replace writers jobs yet.
| Especially when you want to create something authentic and
| unique.
| pseudosavant wrote:
| Sometimes the value is specifically because my voice won't
| come through. When I'm stressed and being asked for
| unreasonable things at work, I know that I tend toward
| passive aggression. But professionally, that isn't the way I
| want my message to come across.
|
| I use ChatGPT all the time to suggest how I could make sure
| something isn't passive aggressive. It'll point out parts
| that read as aggressive and suggest changes. It can be for a short
| Slack message, or a many-paragraph message.
| floren wrote:
| I have definitely read "blogs" written by stitching together
| LLM outputs. For years people were advised that a technical
| blog "looks good on a resume" so we saw lots of lightly
| rewritten Stackoverflow content. Now it's gotten easier.
| tredre3 wrote:
| > The temptation to use these technologies all the time is
| too strong for even self preservation of your own (writers)
| voice.
|
| I don't know about that. I have played with
| ChatGPT/Copilot/etc enough to know what they're capable of
| doing. But the thing is, I enjoy programming. I enjoy
| breaking down a problem and solving it with code. I enjoy
| crafting elegant code. So I don't use AI even though I'm
| fully aware it could save me hours on projects. Why? Because
| I enjoy those hours very much.
|
| Why am I telling you all this? Because I suspect many writers
| are the same and personal blogs are their canvas. They enjoy
| communicating. They enjoy crafting articles. They might have
| AI proof-read them, but they won't let them write everything.
| So, to me, there is hope that personal blogs will maintain
| their human element, as opposed to news websites or tabloids
| or learning platforms.
| steelframe wrote:
| > So I don't use AI even though I'm fully aware it could
| save me hours on projects.
|
| Enjoy this luxury while it lasts. Based on what I have seen
| in performance review committees for software developers,
| your peers who drive results faster than you do because
| they use AI will be rewarded more and will be more likely
| to survive rounds of layoffs when they inevitably happen.
| JohnFen wrote:
| That's fine. I genuinely wouldn't want to continue
| working in an industry that worked like that anyway, so
| I'd just quit and keep on programming with my own
| projects. So that luxury will last as long as I want it
| to.
| SoftTalker wrote:
| Agree. I've never even looked at any of these AI tools. I
| enjoy the process and the challenge of programming, and the
| rewards of doing it well. I have no desire for someone or
| something else to write code for me.
| robinsonb5 wrote:
| I suspect in the coming years the Wayback Machine at
| archive.org will become ever more important - always assuming
| it's not lost as collateral damage in their copyright battles.
| Indexing that dataset and making it searchable would massively
| increase its value.
|
| My inner conspiracy theorist can't help wonder if the continued
| reduction in search usefulness isn't part of an ongoing
| deliberate disempowerment of everyday people - but my rational
| side says it's merely an unfortunate emergent behaviour of the
| systems we've built.
| carlosjobim wrote:
| The shadow libraries.
| datadrivenangel wrote:
| There's the branch of philosophy called epistemology.
| LetsGetTechnicl wrote:
| Just another reason that I consider generative AI to be a lot
| like crypto. A lot of talk about it being the future but really
| only turns out to be dangerous or useless. I find it incredibly
| irresponsible that companies are shoving their latest AI tech
| into all their products when it's still unproven.
| stevenwoo wrote:
| One thing I've noticed about simple one word searches on Bing
| now - a lot of times it just errors out and closes the Bing app
| tab you've opened with no explanation to the user. This only
| started happening after they pushed the AI driven search
| narrative to make you use it in the app, so apparently single
| word searches are too much somehow for their version of AI to
| handle.
| happytiger wrote:
| AI has so completely disrupted search that it's destroyed
| leading platforms' effectiveness in a matter of months.
|
| But because of its current lack of optimization for accuracy,
| we shouldn't consider it disruptive because it's not yet proven
| technology?
|
| You can call it dangerous but you can't call it useless. It's
| also only going towards improvement from here, including
| drastic reductions in hallucinations.
|
| You have to remember too that AI models are generally
| attempting to interpret the intent behind the prompt, so many
| of these crazy articles are happening because people aren't yet
| good at writing clear instructions for AI and AI isn't yet
| mature enough to disambiguate poor instructions in its output
| and is trying to deliver on unclear instructional intents.
| [deleted]
| 12_throw_away wrote:
| > It's also only going towards improvement from here
|
| Why?
| 0xEFF wrote:
| See for yourself, 4.0 is clearly improved over 3.5.
| ChatGTP wrote:
| True, 5 is a bigger number than 4 so logically it makes
| sense.
| pseudosavant wrote:
| Except, unlike crypto, ChatGPT helps me with real everyday things
| that I easily get at least $20/month of value from.
| figassis wrote:
| I think we all saw this coming, talked about it, articles were
| published even...but now it's news
| gumballindie wrote:
| The correct term is spamming. People are using these text
| generators to spam everyone and everything under the sun. It will
| be detrimental to the internet as many people will just give up on
| this huge pile of ... spam.
| kiernanmcgowan wrote:
| Without naming the company, I have seen specific examples of blog
| posts being written by AI, hallucinating a "fact", and then that
| "fact" re-surfacing inside of Bard.
|
| It's xkcd's Citogenesis automated and at internet scale
| https://xkcd.com/978/
| mattlondon wrote:
| Or to use the technical term: "shat the bed". Welcome to the
| future.
| Condition1952 wrote:
| Please get your answers from Anna's Library
| abruzzi wrote:
| I have to say--the opening paragraph doesn't describe a reality
| I'm familiar with:
|
| >Web search is such a routine part of daily life that it's easy
| to forget how marvelous it is. Type into a little text box and a
| complex array of technologies--vast data centers, ravenous web
| crawlers, and stacks of algorithms that poke and parse a query--
| spring into action to serve you a simple set of relevant results.
|
| Web search has, for me, become a nasty twisted hall of mirrors
| well before generative AI. I almost never get fed relevant
| results, I almost always have to go back and quote all my search
| terms because the search engine decided it didn't really need to
| use all of them (usually just one.) The only difference is the
| poison was human generated. generative AI will simply erase the
| 5% of results that might give me an answer quickly.
| meowface wrote:
| I've had the exact same experience. That said, when I do add
| all the right quotes and conditions to the query to filter out
| the blog/newsspam drivel, I still - usually - eventually - get
| pretty good results. Sometimes I have to switch to Bing or even
| Yandex, but it's rare.
|
| Adding "reddit" to queries can be pretty useful. You're prone
| to get terrible, inaccurate information since it's just random
| people on an internet forum, but at least it's (usually) actual
| humans and not blogs trying to SEO-game. (Though one big caveat
| is searching for products/services. Lots of threads full of bot
| accounts writing "[link] has been the best [thing], in my
| experience". They're usually easy to spot, but sometimes they
| do seem pretty natural until you check the post history.)
| ryandrake wrote:
| > You're prone to get terrible, inaccurate information since
| it's just random people on an internet forum, but at least
| it's (usually) actual humans and not blogs trying to SEO-
| game.
|
| Less and less so. Reddit has always had a bot problem, but it
| seems to be getting exponentially worse lately. Not just
| article reposters, but comment reposters, bots that reverse
| images and videos just to repost, seems like it's at least
| 75% bot content now.
| bnralt wrote:
| Not only that, but you're also left with the issue of parsing
| what someone else has written. Even when using answers I find
| from web searches, I often drop results into ChatGPT so I can
| get a rough idea of what the person is trying to say first, or
| check if it agrees with my understanding of what's being said.
| jfengel wrote:
| I experience that when I try to google for technical problems
| I'm having at work, but otherwise searches still go pretty well
| for me.
|
| I just had to google a bunch of races that I wanted to run. The
| top result was always the event's own web site.
|
| When I google some news, relevant news articles always come up.
|
| The last search I did was for how to display a ket vector in
| LaTeX. The top result was the StackExchange article with the
| right answer.
|
| From what I see, certain domains seem to be targeted for
| exploitation. Programming questions seem to be high up on the
| list. I wonder if that skews HN readers' perceptions.
| JSavageOne wrote:
| Google search to retrieve specific factual information is
| pretty good.
|
| Google search to retrieve anything opinion related has been
| horrible and infested with blogspam for years (hence people
| searching Reddit to get that kind of info).
| jamal-kumar wrote:
| Really? I've been finding it doesn't even find stuff it
| used to in certain documentation (I'm talking like things
| it found maybe a year ago), "searching in quotes for this
| stuff", things that other search engines (bing, kagi) are
| indexing just fine - And since I've switched to using these
| engines more when I'm searching things for programming
| work, it's definitely been a lot more helpful than google
| which often just seems to be missing a ton now
| jfengel wrote:
| I suppose it never occurs to me to search for opinions. I'm
| not even sure how I'd go about it, even if search weren't
| broken. Blogspam is what I'd expect to see.
|
| I'm more likely to start at a place that aggregates reviews
| and try to hallucinate which ones were written by people
| who know what they're talking about. That usually seems to
| work.
|
| I imagine that somewhere out there is a person who bought
| the product and reviewed it on their blog or made some
| enthusiastic social media post about it, and that's what
| you'd want to locate were it not for the spam. But I don't
| expect any search engine to be able to find it for me.
| fnordpiglet wrote:
| Google search to retrieve product marketing pages is pretty
| good. Specific factual information searches lead to product
| marketing pages. Opinion searches lead to product marketing
| pages.
|
| Google is a giant adware tool that's been taken over by
| adware SEO sites. The example given - find the product
| marketing pages for some races - falls directly in its
| sweet spot. If you venture outside it'll do its best to get
| you back into the product marketing sweet spot, and the SEO
| companies of the world take care of the rest.
|
| Search is a lost cause.
| icyberbullyu wrote:
| As someone who has been using search engines since the 90's,
| I've found that the "old-school" way of formatting your search
| almost like a database query has gotten significantly worse. It
| seems like search engines are geared more towards natural
| language queries now; probably because the old Google-Fu way of
| doing things wasn't very friendly for people who didn't use
| computers regularly.
| klyrs wrote:
| My understanding is that google went from a more traditional
| database style which supported such queries, to a newer
| "n-gram" index with a layer of semantic similarity. Notably,
| you can no longer put a sentence in quotes to only find pages
| that contain that exact phrase. Also, the order of words
| matters more now than it used to (where the old search
| engines treated a space as AND, so order was irrelevant
| outside of quotes)
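The "space as AND, quotes as exact phrase" behaviour described above can be illustrated with a toy matcher. This is only a sketch of those semantics as stated in the comment, not how any real search engine is implemented; the function names and the simple tokenizer are hypothetical choices for the example.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def parse_query(query):
    """Split a query into quoted phrases and bare terms."""
    phrases = [tokenize(p) for p in re.findall(r'"([^"]+)"', query)]
    remainder = re.sub(r'"[^"]*"', " ", query)
    terms = tokenize(remainder)
    return phrases, terms


def matches(query, document):
    """Return True if the document satisfies the query."""
    phrases, terms = parse_query(query)
    words = tokenize(document)
    word_set = set(words)

    # Bare terms: each must appear somewhere, in any order (implicit AND).
    if not all(term in word_set for term in terms):
        return False

    # Quoted phrases: the words must appear consecutively and in order.
    for phrase in phrases:
        n = len(phrase)
        found = any(
            words[i:i + n] == phrase for i in range(len(words) - n + 1)
        )
        if not found:
            return False
    return True
```

With the document "The quick brown fox jumps over the lazy dog", the bare queries `fox quick` and `quick fox` both match, while the quoted query `"quick fox"` does not, because the two words are not adjacent.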
| saalweachter wrote:
| https://www.google.com/search?q=%22you+can+no+longer+put+a+
| s...
| klyrs wrote:
| Hah, perhaps I should edit that to say "reliably."
| interstice wrote:
| If someone brought back a search engine like this i'd happily
| use it
| heavyset_go wrote:
| Sounds like a success if that means people see more ads while
| trying to find what they actually searched for.
| __loam wrote:
| Nationalize Google.
|
| Nothing will change as long as search is optimized for
| revenue over user value.
| loupol wrote:
| Agreed that web search quality has been deteriorating since
| much earlier than LLMs gaining popularity.
|
| Interestingly, we are in a spot right now where I feel that for
| certain types of queries LLMs can outperform search engines.
| But from what is shown in the article, it seems like that state
| might only be temporary, and that in the same way that shitty
| content farms mastered SEO and polluted search results, we
| might see the same happening with LLMs that have access to the
| Internet.
___________________________________________________________________
(page generated 2023-10-05 23:00 UTC)