[HN Gopher] Search tool that only returns content created before...
       ___________________________________________________________________
        
       Search tool that only returns content created before ChatGPT's
       public release
        
       Author : dmitrygr
       Score  : 822 points
       Date   : 2025-12-01 04:06 UTC (18 hours ago)
        
 (HTM) web link (tegabrain.com)
 (TXT) w3m dump (tegabrain.com)
        
       | johng wrote:
       | I don't know how this works under the hood but it seems like no
       | matter how it works, it could be gamed quite easily.
        
         | cryzinger wrote:
         | If it's just using Google search "before <x date>" filtering I
          | don't _think_ there's a way to game it... but I guess that
         | depends on whether Google uses the date that it indexed a page
         | versus the date that a page itself declares.
        
           | madars wrote:
           | Date displayed in Google Search results is often the self-
           | described date from the document itself. Take a look at this
           | "FOIA + before Jan 1, 1990" search: https://www.google.com/se
           | arch?q=foia&tbs=cdr:1,cd_max:1/1/19...
           | 
            | None of these documents were actually published on the web by
            | then, including a Watergate PDF bearing a date of Nov 21,
            | 1974 - almost 20 years before the PDF format was released. Of
            | course, the WWW itself started in 1991.
           | 
           | Google Search's date filter is useful for finding documents
           | _about_ historical topics, but unreliable for proving when
           | information actually became publicly available online.
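
        A minimal sketch of how such a date-restricted query URL can be
        built, for anyone who wants to reproduce the search above. The
        tbs=cdr:1,cd_max:... parameter format is inferred from the
        truncated URL in the comment; it is undocumented and may change,
        so treat it as an assumption:

            # Python sketch: build a Google search URL restricted to
            # results dated before a cutoff (parameter format assumed
            # from the tbs=cdr:1,cd_max:... URL quoted above).
            from urllib.parse import urlencode

            def dated_search_url(query: str, max_date: str) -> str:
                """max_date is M/D/YYYY, e.g. "11/30/2022"."""
                params = {"q": query, "tbs": f"cdr:1,cd_max:{max_date}"}
                return "https://www.google.com/search?" + urlencode(params)

            print(dated_search_url("foia", "1/1/1990"))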
        
             | littlestymaar wrote:
             | Are you sure it works the same way for documents that
             | Google indexed at the time of publication? (Because
             | obviously for things that existed before Google, they had
             | to accept the publication date at face value).
        
               | madars wrote:
               | Yes, it works the same way even for content Google
               | indexed at publication time. For example, here are
               | chatgpt.com links that Google displays as being from
               | 2010-2020, a period when Google existed but ChatGPT did
               | not:
               | 
               | https://www.google.com/search?q=site%3Achatgpt.com&tbs=cd
               | r%3...
               | 
               | So it looks like Google uses inferred dates over its own
               | indexing timestamps, even for recently crawled pages from
               | domains that didn't exist during the claimed date range.
        
               | littlestymaar wrote:
               | Interesting, thanks.
               | 
               | I wonder why they do that when they could use time of
               | first indexing instead.
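
        One way to approximate "time of first indexing" with public data is
        the Internet Archive's CDX API, which lists capture timestamps in
        ascending order by default. A rough sketch, assuming the page has
        been archived at least once (the endpoint and parameters below are
        used as commonly documented; verify before relying on them):

            # Python sketch: earliest Wayback Machine capture of a URL,
            # as a rough proxy for when a page first appeared online.
            import json
            import urllib.parse
            import urllib.request

            def earliest_capture(url: str):
                api = ("https://web.archive.org/cdx/search/cdx?url="
                       + urllib.parse.quote(url)
                       + "&output=json&limit=1")
                with urllib.request.urlopen(api) as resp:
                    rows = json.load(resp)
                # rows[0] is the header row; rows[1], if present, is the
                # oldest capture, with a YYYYMMDDhhmmss timestamp at index 1.
                return rows[1][1] if len(rows) > 1 else None

            print(earliest_capture("news.ycombinator.com"))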
        
         | qwertygnu wrote:
          | True, but there are probably many ways to do this, and unless
          | AI content starts falsifying tons of its metadata (which I'm
          | sure would have other consequences), there's definitely a way.
          | 
          | Plus, other sites that link to the content could also give away
          | its date of creation, which is out of the control of the AI
          | content.
        
           | layman51 wrote:
           | I have heard of a forum (I believe it was Physics Forums)
           | which was very popular in the older days of the internet
           | where some of the older posts were actually edited so that
           | they were completely rewritten with new content. I forget
           | what the reasoning behind it was, but it did feel shady and
           | unethical. If I remember correctly, the impetus behind it was
           | that the website probably went under new ownership and the
           | new owners felt that it was okay to take over the accounts of
           | people who hadn't logged on in several years and to
           | completely rewrite the content of their posts.
           | 
           | I believe I learned about it through HN, and it was this blog
           | post: https://hallofdreams.org/posts/physicsforums/
           | 
           | It kind of reminds me of why some people really covet older
           | accounts when they are trying to do a social engineering
           | attack.
        
             | joshuaissac wrote:
             | > website probably went under new ownership
             | 
             | According to the article, it was the founder himself who
             | was doing this.
        
         | CGamesPlay wrote:
         | "Gamed quite easily" seems like a stretch, given that the
         | target is definitionally not moving. The search engine is
         | fundamentally searching an immutable dataset that "just" needs
         | to be cleaned.
        
           | johng wrote:
           | How? They have an index from a previous date and nothing new
           | will be allowed since that date? A whole copy of the
           | internet? I don't think so.... I'm guessing, like others,
           | it's based on the date the user/website/blog lists in the
           | post. Which they can change at any time.
        
             | fragmede wrote:
              | Yes, they do. It's called Common Crawl, and it's available
              | from your chosen hyperscaler vendor.
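
        Common Crawl publishes a per-crawl CDX-style index, so in principle
        lookups can be restricted to crawls collected before ChatGPT's
        release. A hedged sketch: the crawl label CC-MAIN-2022-40 and the
        index.commoncrawl.org query parameters are assumptions to check
        against the published crawl list before relying on them:

            # Python sketch: list captures of a URL pattern from a single
            # Common Crawl crawl assumed to predate Nov 30, 2022.
            import json
            import urllib.parse
            import urllib.request

            CRAWL = "CC-MAIN-2022-40"  # assumption: verify its crawl dates

            def captures(url_pattern: str):
                api = (f"https://index.commoncrawl.org/{CRAWL}-index"
                       f"?url={urllib.parse.quote(url_pattern)}&output=json")
                with urllib.request.urlopen(api) as resp:
                    for line in resp:
                        yield json.loads(line)  # one JSON record per line

            for rec in captures("example.com/*"):
                print(rec.get("timestamp"), rec.get("url"))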
        
       | 1gn15 wrote:
       | Does this filter out traditional SEO blogfarms?
        
         | JKCalhoun wrote:
         | Yeah, might prefer AI-slop to marketing-slop.
        
           | al_borland wrote:
           | They are the same. I was looking for something and tried AI.
           | It gave me a list of stuff. When I asked for its sources, it
           | linked me to some SEO/Amazon affiliate slop.
           | 
           | All AI is doing is making it harder to know what is good
           | information and what is slop, because it obscures the source,
           | or people ignore the source links.
        
             | venturecruelty wrote:
             | I've started just going to more things in person, asking
             | friends for recommendations, and reading more books
             | (should've been doing all of these anyway). There are some
             | niche communities online I still like, and the fediverse is
             | really neat, but I'm not sure we can stem the Great Pacific
             | Garbage Patch-levels of slop, at this point. It's really
             | sad. The web, as we know and love it, is well and truly
             | dead.
        
       | swyx wrote:
       | somebody said once we are mining "low-background tokens" like we
       | are mining low-background (radiation) steel post WW2 and i
        | couldn't shake the concept out of my head
       | 
       | (wrote up in https://www.latent.space/i/139368545/the-concept-of-
       | low-back... - but ironically repeating something somebody else
       | said online is kinda what i'm willingly participating in, and
       | it's unclear why human-origin tokens should be that much higher
       | signal than ai-origin ones)
        
         | jeffchuber wrote:
         | that was me swyx
        
           | rollulus wrote:
           | Multiple people have coined the idea repeatedly, way before
           | you. The oldest comment on HN I could find was in December
           | 2022 by user spawarotti:
           | https://news.ycombinator.com/item?id=33856172
        
             | threeducks wrote:
             | Here is an even older comment chain about it from 2020:
             | https://news.ycombinator.com/item?id=23895706
             | 
             | Apparently, comparing low-background steel to pre-LLM text
             | is a rather obvious analogy.
        
               | rollulus wrote:
               | Oh wow, great find! That's really early days.
        
               | pseidemann wrote:
                | Also, people often do think alike.
                | 
                | If you have a thought, it's likely not new.
        
             | jeffchuber wrote:
              | i didn't claim to invent it.
             | 
             | i claimed swyx heard it through me - which he did
        
               | swyx wrote:
               | you did!!
        
         | jrjfjgkrj wrote:
         | every human generation built upon the slop of the previous one
         | 
         | but we appreciated that, we called it "standing on the
         | shoulders of giants"
        
           | rebuilder wrote:
           | That's because the things we built on weren't slop
        
           | walrusted wrote:
           | the only structure you can build with slop is a burial mound
        
             | Dilettante_ wrote:
             | What is unhardened concrete but slop?
        
           | hoppp wrote:
           | You can't build on slop because slop is a slippery slope
        
             | Dilettante_ wrote:
             | Maybe we'll have to slurp the slop so we don't slip on the
             | slope.
        
           | kgwgk wrote:
           | Nothing conveys better the idea of a solid foundation to
           | build upon than the word 'slop'.
        
           | teiferer wrote:
           | Because the pyramids, the theory of general relativity and
           | the Linux kernel are all totally comparable to ChatGPT
           | output. /s
           | 
           | Why is anybody still surprised that the AI bubble made it
           | that big?
        
             | jrjfjgkrj wrote:
              | for every theory of relativity there is the religious
              | nonsense and superstition of the medieval ages or of today
        
               | JumpCrisscross wrote:
                | > _for every theory of relativity there is the religious
                | nonsense and superstition of the medieval ages or of
                | today_
               | 
               | If Einstein came up with relativity by standing on "the
               | religious non-sense and superstitions of the medieval
               | ages," you'd have a point.
        
               | scotty79 wrote:
               | If we have billions of AIs one might pick the correct
               | learning materials. Same way human Einstein did.
        
               | flir wrote:
               | I know we're just pointlessly abusing the analogy here,
               | but... mediaeval cathedrals are a greater work of
               | artifice than pyramids.
        
           | bigiain wrote:
           | > we called it "standing on the shoulders of giants"
           | 
           | We do not see nearly so far though.
           | 
           | Because these days we are standing on the shoulders of giants
           | that have been put into a blender and ground down into a
           | slippery pink paste and levelled out to a statistically
           | typical 7.3mm high layer of goo.
        
             | _kb wrote:
             | The secret is you then have to heat up that goo. When the
             | temperature gets high enough things get interesting again.
        
               | pseidemann wrote:
               | Just simulate some evolution here and there.
        
               | gilleain wrote:
               | You get Flubber?
        
           | groestl wrote:
           | We have two optimization mechanisms though which reduce noise
           | with respect to their optimization functions: evolution and
           | science. They are implicitly part of "standing on the
           | shoulders of giants", you pick the giant to stand on (or it
           | is picked for you).
           | 
            | Whether or not the optimization functions align with human
            | survival, and thus whether our whole existence is not itself
            | slop, we're about to find out.
        
           | shevy-java wrote:
           | This sounds like an Alan Kay quote. He meant that in regards
           | to useful inventions. AI-generated spam just decreases the
           | quality. We'd need a real alternative to this garbage from
           | Google but all the other search engines are also bad. And
           | their UI is also horrible - not as bad as Google, but also
           | bad. Qwant just tries to copy/paste Google for instance
           | (though interestingly enough, sometimes it has better results
            | than Google - but also fewer in general, even ignoring false
           | positive results).
        
             | visarga wrote:
              | Deep Research reports, I think, are above average internet
              | quality: they collect hundreds of sources, synthesize and
              | contrast them, and provide backlinks. Almost like a
              | generative Wikipedia.
             | 
             | I think all we can expect from internet information is a
             | good description of the distribution of materials out
             | there, not truth. This is totally within the capabilities
             | of LLMs. For additional confidence run 3 reports on
             | different models.
        
           | ben_w wrote:
           | There's a reason this is comedy:                 Listen, lad.
           | I built this kingdom up from nothing. When I started here,
           | all there was was swamp. Other kings said I was daft to build
           | a castle on a swamp, but I built it all the same, just to
           | show 'em. It sank into the swamp. So, I built a second one.
           | That sank into the swamp. So, I built a third one. That
           | burned down, fell over, then sank into the swamp, but the
           | fourth one... stayed up! And that's what you're gonna get,
           | lad: the strongest castle in these islands.
           | 
           | While this is religious:                 [24] "Everyone then
           | who hears these words of mine and does them will be like a
           | wise man who built his house on the rock. [25] And the rain
           | fell, and the floods came, and the winds blew and beat on
           | that house, but it did not fall, because it had been founded
           | on the rock. [26] And everyone who hears these words of mine
           | and does not do them will be like a foolish man who built his
           | house on the sand. [27] And the rain fell, and the floods
           | came, and the winds blew and beat against that house, and it
           | fell, and great was the fall of it."
           | 
           | Humans build not on each other's slop, but on each other's
           | success.
           | 
           | Capitalism, freedom of expression, the marketplace of ideas,
           | democracy: at their best these things are ways to bend the
           | wisdom of the crowds (such as it is) to the benefit of all;
           | and their failures are when crowds are not wise.
           | 
           | The "slop" of capitalism is polluted skies, soil and water,
           | are wage slaves and fast fashion that barely lasts one use,
           | and are the reason why workplace health and safety rules are
           | written in blood. The "slop" of freedom of expression
           | includes dishonest marketing, libel, slander, and propaganda.
           | The "slop" of democracy is populists promising everything to
           | everyone with no way to deliver it all. The "slop" of the
           | marketplace of ideas is every idiot demanding their own un-
           | informed rambling be given the same weight as the considered
           | opinions of experts.
           | 
            | None of these things contributed to our social,
            | technological, or economic advancement; they are simply
            | things which happened at the same time.
           | 
           | AI has stuff to contribute, but using it to make an endless
            | feed of mediocrity is not it. To the flood of low-effort
            | GenAI stuff filling feeds and drowning signal with noise, as
            | others have said: just give us your prompt.
        
           | pseidemann wrote:
           | You may have one point.
           | 
           | The industrial age was built on dinosaur slop, and they were
           | giant.
        
           | Mistletoe wrote:
           | How to make fire or kill a woolly mammoth was not slop come
           | on.
        
         | mwidell wrote:
         | Low background steel is no longer necessary.
         | 
         | "...began to fall in 1963, when the Partial Nuclear Test Ban
         | Treaty was enacted, and by 2008 it had decreased to only 0.005
         | mSv/yr above natural levels. This has made special low-
         | background steel no longer necessary for most radiation-
         | sensitive uses, as new steel now has a low enough radioactive
         | signature."
         | 
         | https://en.wikipedia.org/wiki/Low-background_steel
        
           | juvoly wrote:
           | Interesting. I guess that analogously, we might find that X
           | years after some future AI content production ban, we could
           | similarly start ignoring the low background token issue?
        
             | actionfromafar wrote:
             | We used a rather low number of atmospheric bombs, while we
             | are carpet bombing the internet every day with AI marketing
             | copy.
        
               | MadnessASAP wrote:
               | The eternal September has finally ended. We've now
               | entered the AI winter. It promises to be long, dark, and
               | full of annoyances.
        
               | embedding-shape wrote:
               | "Winter" in AI (or cryptocurrency, or any at all)
               | ecosystems denote a period of low activity, and a focus
               | on fundamentals instead of driven by hype.
               | 
               | What we're seeing now is something more like the peak of
               | summer. If it ends up being a bubble, and it burtst, some
               | months after that will be "AI Winter" as investors won't
               | want to continue chucking money at problems anymore, and
               | it'll go back to "in the background research" again, as
               | it was before.
        
               | MadnessASAP wrote:
               | It was a continuation of the nuclear analogy, a nuclear
               | winter following a large scale nuclear exchange.
               | 
               | Also that winter comes after September (fall)
        
               | SecretDreams wrote:
                | We're bombing the internet into extinction. But we were
                | doing that well before AI. It got really bad during the
                | SEO/monetization phase; AI was just the final nail.
        
             | piker wrote:
             | What's the half-life of a viral meme?
        
           | doe88 wrote:
           | Can't wait, in fifty years we will have our data clean again.
        
         | alansaber wrote:
          | Since synthetic data for training is pretty ubiquitous, this
          | seems like a novelty.
        
       | anticensor wrote:
       | You should call it Predecember, referring to the eternal
       | December.
        
         | unfunco wrote:
         | September?
        
           | littlestymaar wrote:
           | ChatGPT was released exactly 3 years ago (on the 30th of
           | November) so December it is in this context.
        
             | permo-w wrote:
             | surely that would be eternal November then
        
               | littlestymaar wrote:
               | No, being released on Nov 30th means November was still
               | before the slop era.
        
               | retsibsi wrote:
               | In the end the analogy doesn't really work, because
               | 'eternal September' referred to what used to be a
               | regular, temporary thing (an influx of noobs disrupting
               | the online culture, before eventually leaving or
               | assimilating) becoming the new normal. 'Eternal {month
               | associated with ChatGPT}' doesn't fit because LLM-
               | generated content was never a periodic phenomenon.
        
               | hackable_sand wrote:
               | AI R&D certainly _was_ periodic. Good thing we put a stop
               | to that!
        
               | permo-w wrote:
                | to be honest, GPT-3, which was pretty solid and extremely
                | capable of producing webslop, had been out for a good
                | while before ChatGPT, and even GPT-2 had been used for
                | blogslop years before that. maybe ChatGPT was the point
                | when the public became aware of it, but it was going on
                | well beforehand. and, as the sibling commenter points
                | out, the analogy doesn't quite fit structurally either
        
               | AlecSchueler wrote:
               | Yes, and this site is for everything before the slop era,
               | hence eternal November.
        
               | 123malware321 wrote:
               | everything is dead after november passes
        
             | anticensor wrote:
             | aka 0 December
        
       | themanmaran wrote:
       | The low-background steel of the internet
       | 
       | https://en.wikipedia.org/wiki/Low-background_steel
        
         | HelloUsername wrote:
         | As mentioned half a year ago at
         | https://news.ycombinator.com/item?id=44239481
        
           | thm wrote:
           | As mentioned 7 months ago
           | https://news.ycombinator.com/item?id=43811732
        
             | Ginger-Pickles wrote:
             | As mentioned in this thread :P
             | https://news.ycombinator.com/item?id=46103662
        
       | tkgally wrote:
       | Somewhat related, the leaderboard of em-dash users on HN before
       | ChatGPT:
       | 
       | https://www.gally.net/miscellaneous/hn-em-dash-user-leaderbo...
        
         | maplethorpe wrote:
         | They should include users who used a double hyphen, too -- not
         | everyone has easy access to em dashes.
        
           | venturecruelty wrote:
           | Oof, I feel like you'll accidentally capture a lot of
           | getopt_long() fans. ;)
        
             | Kinrany wrote:
             | Excluding those with asymmetrical whitespace around might
             | be enough
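
        A rough sketch of that idea: count em dashes plus double hyphens,
        but only double hyphens with symmetric surroundings, so asymmetric
        uses such as getopt-style --flags are skipped. This is only an
        illustration, not how the leaderboard actually works:

            # Python sketch: count em-dash-like usage in a comment,
            # ignoring '--' with asymmetric whitespace (e.g. CLI flags).
            import re

            EM_DASH = re.compile("\u2014")
            # '--' either between two word characters or between two spaces
            DOUBLE_HYPHEN = re.compile(r"(?<=\w)--(?=\w)|(?<= )--(?= )")

            def dash_count(comment: str) -> int:
                return (len(EM_DASH.findall(comment))
                        + len(DOUBLE_HYPHEN.findall(comment)))

            print(dash_count("I did this--long ago--but --verbose is a flag"))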
        
           | gblargg wrote:
           | Does AI use double hyphens? I thought the point was to find
           | who wasn't AI that used proper em dashes.
        
             | jader201 wrote:
             | Anytime I do this -- and I did it long before AI did --
             | they are always em dashes, because iOS/macOS translates
             | double dashes to em dashes.
             | 
             | I think there may be a way to disable this, but I don't
             | care enough to bother.
             | 
             | If people want to think my posts are AI generated, oh well.
        
               | teiferer wrote:
               | There is also the difference in using space around em-
               | dashes.
        
               | JumpCrisscross wrote:
               | > _Anytime I do this -- and I did it long before AI did
               | -- they are always em dashes_
               | 
                | It depends if you put the space before and after the
                | dashes--which, to be clear, are meant to be there--or if
                | you don't.
        
               | oniony wrote:
               | I cannot remember ever reading a book where there was a
               | space around the dashes.
        
               | kuschku wrote:
               | That depends on the language -- whereas German puts
               | spaces around --, English afaik usually doesn't.
               | 
               | Similarly, French puts spaces _before_ and after ? !
               | while English and German only put spaces afterwards.
               | 
               | [EDIT: I originally wrote that French treats . , ! ?
                | specially. In reality, French only treats ? and !
               | specially.]
        
               | iLoveOncall wrote:
               | French doesn't put one before the period.
        
               | bratwurst3000 wrote:
               | french does "," and "." like the british and germans the
               | rest is space befor space after
        
               | greenicon wrote:
               | In German you use en-dashes with spaces, whereas in
               | English it's em-dashes without spaces. Some people
               | dislike em-dashes in English though and use en-dashes
               | with spaces as well.
        
               | JumpCrisscross wrote:
               | > _whereas in English it's em-dashes without spaces_
               | 
               | Didn't know! Woot, I win!
               | 
               | Why does AI have a preference for doing it differently?
        
               | dragonwriter wrote:
                | In English, em-dashes are typically set without spaces or
                | with thin spaces when used to separate
                | appositives/parentheticals (though that style isn't
                | universal even in professional print: there are places
                | that set them open, and en-dashes set open can also be
                | used in this role); when representing an interruption,
                | they generally have no space before but frequently have a
                | space following. And other uses have other patterns.
        
               | bloak wrote:
               | In British English en-dashes with spaces is more common
               | than em-dashes without spaces, I think, but I don't have
               | any data for that, just a general impression.
        
               | LoganDark wrote:
               | Technically, there are supposed to be _hair spaces_
                | around the dashes, not regular spaces. They're small
               | enough to be sometimes confused for kerning.
        
               | cachius wrote:
               | Em dashes used as parenthetical dividers, and en dashes
               | when used as word joiners, are usually set continuous
               | with the text. However, such a dash can optionally be
               | surrounded with a hair space, U+200A, or thin space,
                | U+2009, or HTML named entities &hairsp; and &thinsp;. These
               | spaces are much thinner than a normal space (except in a
               | monospaced (non-proportional) font), with the hair space
               | in particular being the thinnest of horizontal whitespace
               | characters.
               | 
               | https://en.wikipedia.org/wiki/Whitespace_character#Hair_s
               | pac...
               | 
                | Typographers usually add space to the left side of
                | closing marks (e.g. : ; ! ? ) ] } » @ (r) (tm) deg).
                | 
                | And they usually add space to the right of opening marks
                | (e.g. " ' / ( [ { « and currency signs such as $ and
                | EUR).
               | 
               | https://www.smashingmagazine.com/2020/05/micro-
               | typography-sp...
               | 
               | 1. (letterpress typography) A piece of metal type used to
               | create the narrowest space. 2. (typography, US) The
               | narrowest space appearing between letters and
               | punctuation.
               | 
               | https://en.wiktionary.org/wiki/hair_space
               | 
                | Now I'd like to see what the metal type looks like, but
               | ehm... it's difficult googling it. Also a whole
               | collection of space types and what they're called in
               | other languages.
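
        For a quick look at the characters being discussed, here is a tiny
        sketch printing the regular, thin, and hair space around an em
        dash; the apparent widths depend entirely on the font used to view
        the output:

            # Python sketch: thin space (U+2009, &thinsp;) and hair space
            # (U+200A, &hairsp;) compared with a regular space (U+0020).
            spaces = {
                "regular space": "\u0020",
                "thin space": "\u2009",
                "hair space": "\u200A",
            }
            for name, ch in spaces.items():
                print(f"{name:13}  U+{ord(ch):04X}  word{ch}\u2014{ch}word")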
        
               | fragmede wrote:
               | What, no love for our friend the en-dash?
               | 
                | - vs – vs —
        
               | chickensong wrote:
               | I once spent a day debugging some data that came from an
               | English doc written by someone in Japan that had been
               | pasted into a system and caused problems. Turned out to
               | be an en-dash issue that was basically invisible to the
               | eye. No love for en-dash!
        
               | ben_w wrote:
               | Similar.
               | 
               | Compiler error while working on some ObjC. Nothing
               | obviously wrong. Copy-pasted the line, same thing on the
               | copy. Typed it out again, no issue with the re-typed
               | version. Put the error version and the ok version next to
               | each other, apparently identical.
               | 
                | I ended up discovering I'd accidentally leant on the
                | option key while pressing the "-"; monospace font, Xcode,
               | m-dash and minus looked identical.
        
               | 1718627440 wrote:
               | This issue also exists with (so called) "smart" quotes.
        
               | fragmede wrote:
               | Which, the iOS keyboard "helpfully" uses for you.
        
               | withinboredom wrote:
               | Especially when you're sending some quick scratch code in
               | a slack message.
        
               | mh- wrote:
               | Pretty much the first thing I turn off on a new laptop
               | (it's in the keyboard settings on iOS too.)
        
           | bigiain wrote:
           | That would false positive me. I have used double dashes to
           | delimit quote attribution for decades.
           | 
           | Like this:
           | 
           | "You can't believe everything you read on the internet." --
           | Abraham Lincoln, personal correspondence, 1863
        
             | dragonwriter wrote:
             | That's literally a standard use of em-dash being
             | approximated by a double hyphen, though.
        
           | SoftTalker wrote:
           | Double-hyphen is an en-dash. Triple-hyphen is an em-dash.
        
             | dragonwriter wrote:
              | Double hyphen is replaced in some software with an en-dash
              | (and in those, a triple hyphen is often replaced with an
              | em-dash), and in some with an em-dash; it's usually used
              | (other than as input to one of those pieces of software) in
              | places where an em-dash would be appropriate, but in
              | contexts where both an em-dash set closed and an en-dash
              | set open might be used, it is often set open.
              | 
              | So it's not unambiguously a substitute for either; it is
              | essentially its own punctuation mark, used in ASCII-only
              | environments with some influence from both the use of em-
              | dashes and that of en-dashes in more formal environments.
        
         | a5c11 wrote:
          | Apparently, it's not only the em-dash that's distinctive. I
          | went through the leader's comments and spotted that he also
          | uses the backtick "`" instead of the apostrophe.
        
           | kuschku wrote:
           | I (~100 in the leaderboard, regardless of how you sort) also
            | frequently use ’ (the Unicode apostrophe) instead of ' :D
        
           | baiwl wrote:
           | Just to be clear this is done automatically by macOS or iOS
           | browsers when configured properly.
        
         | lxgr wrote:
         | Amazing! But no love for en dashes?
        
       | GaryBluto wrote:
       | Why use this when you can use the before: syntax on most search
       | engines?
        
         | aDyslecticCrow wrote:
          | doesn't actually do anything anymore in Google or Bing.
        
           | Thorrez wrote:
           | Searching Google for
           | 
           | chatgpt
           | 
           | vs
           | 
           | chatgpt before:2022-01-01
           | 
            | gives me quite different results. In the 2nd query, most
           | results have a date listed next to them in the results page,
           | and that date is always prior to 2022. So the date filtering
           | is "working". However, most of the dates are actually Google
           | making a mistake and misinterpreting some unimportant date it
            | found on the page as the date the page was created. At least
            | one result is a YouTube video posted before 2022 whose title
            | was edited after ChatGPT was released to mention ChatGPT.
           | 
           | Disclosure: I work at Google, but not on search.
        
       | progman32 wrote:
       | Not affiliated, but I've been using kagi's date range filter to
       | similar effect. The difference in results for car maintenance
       | subjects is astounding (and slightly infuriating).
        
       | permo-w wrote:
        | besides training future models, is this really such a big deal?
        | most of the AI-gened text content is just replacing content-farm
        | SEO-spam anyway. the same stuff that any half-aware person
        | wouldn't have read in the past is now slightly better written,
        | using more em dashes and instances of the word "delve". if you're
        | consistently being caught out by this stuff then likely you need
        | to improve your search hygiene, nothing so drastic as this
       | 
       | the only place I've ever had any issue with AI content is
       | r/chess, where people love to ask ChatGPT a question and then
       | post the answer as if they wrote it, half the time seemingly
       | innocently, which, call me racist, but I suspect is mostly due to
       | the influence of the large and young Indian contingent. otherwise
       | I really don't understand where the issue lies. follow the exact
       | same rules you do for avoiding SEO spam and you will be fine
        
         | system2 wrote:
          | Yes indeed, it is a problem. Now the good old sites have turned
          | into AI-slop sites because they can't fight the spammers while
          | writing slowly with humans.
        
         | pajamasam wrote:
         | SEO-spam was often at least somewhat factual and not complete
         | generated garbage. Recipe sites, for example, usually have a
         | button that lets you skip the SEO stuff and get to the actual
         | recipe.
         | 
         | Also, the AI slop is covering almost every sentence or phrase
         | you can think of to search. Before, if I used more niche search
         | phrases and exact searches, I was pretty much guaranteed to get
         | specific results. Now, I have to wade through pages and pages
         | of nonsense.
        
         | Cadwhisker wrote:
         | In the past, I'd find one wrong answer and I could easily spot
         | the copies. Now there's a dozen different sites with the same
         | wrong answer, just with better formatting and nicer text.
        
           | finaard wrote:
           | The trick is to only search for topics where there are no
           | answers, or only one answer leading to that blog post you
           | wrote 10 years ago and forgot about.
        
         | zwnow wrote:
         | Yes it is a big deal. I cant find new artists without having a
         | fear of their art being AI generated, same for books and music.
         | I also cant post my stuff to the internet anymore because I
         | know its going to be fed into LLM training data. The internet
         | is dead to me mostly and thankfully I lost almost all interest
         | of being on my computer as much as I used to be.
        
         | darkwater wrote:
         | > besides for training future models, is this really such a big
         | deal? most of the AI-gened text content is just replacing
         | content-farm SEO-spam anyway.
         | 
          | Yes, it is, because of the other side of the coin. If you were
          | writing human-generated, curated content, previously you would
          | just do it in your small patch of the Internet, and search
          | engines (Google...) would probably pick it up anyway because it
          | was good quality content. You just didn't care about SEO-driven
          | shit anyway. Now your nicely hand-written content is going to
          | be fed into LLM training and it's going to be used - whether
          | you want it or not - in the next generation of AI slop content.
        
           | visarga wrote:
            | It's not slop if it is inspired by good content. Basically
            | you need to add your own original spices to the soup to make
            | it not slop, or have the LLM do deep-research kind of work to
            | contrast among hundreds of sources.
           | 
            | Slop did not originate from AI itself, but from the feed-
            | ranking algorithms which set the criteria for visibility.
            | They "prompt" humans to write slop.
           | 
           | AI slop is just an extension of this process, and it started
           | long before LLMs. Platforms optimizing for their own interest
           | at the expense of both users and creators is the source of
           | slop.
        
         | never_inline wrote:
         | A colleague sent me a confident ChatGPT formatted bug report.
         | 
         | It misidentified what the actual bug was.
         | 
          | But the tone was so confident, and he replied to my later
          | messages using ChatGPT itself, which insisted I was wrong.
         | 
         | I don't like this future.
        
           | blitzar wrote:
           | I have dozens of these over the years - many of the people
           | responsible have "Head of ..." or "Chief ..." job titles now.
        
           | artursapek wrote:
           | Did you call his ass out for being lazy and wasting your
           | time?
        
           | crazygringo wrote:
           | It's not the future. Tell him not to do that. If it happens
           | again, bring it to the attention of his manager. Because
           | that's not what he's being paid for. If he continues to do
           | it, that's grounds for firing.
           | 
           | What you're describing is not the future. It's a fireable
           | offense.
        
         | Aurornis wrote:
         | > the only place I've ever had any issue with AI content is
         | r/chess, where people love to ask ChatGPT a question and then
         | post the answer as if they wrote it, half the time seemingly
         | innocently
         | 
          | Some of the science, energy, and technology subreddits receive
          | a lot of ChatGPT repost comments. There are a lot of people who
         | think they've made a scientific or philosophical breakthrough
         | with ChatGPT and need to share it with the world.
         | 
         | Even the /r/localllama subreddit gets constant AI spam from
         | people who think they've vibecoded some new AI breakthrough.
         | There have been some recent incidents where someone posted
         | something convincing and then others wasted a lot of time until
         | realizing the code didn't accomplish what the post claimed it
         | did.
         | 
         | Even on HN some of the "Show HN" posts are AI garbage from
         | people trying to build portfolios. I wasted too much time
         | trying to understand one of them until I realized they had
          | (unknowingly?) duplicated some commits from the upstream project
         | and then let the LLM vibe code a README that sounded like an
         | amazing breakthrough. It was actually good work, but it wasn't
         | theirs. It was just some vibecoding tool eventually arriving at
         | the same code as upstream and then putting the classic LLM
          | written, emoji-filled bullet points in the README.
        
       | tobr wrote:
        | For images, https://same.energy is a nice option that, having
        | been abandoned but still functioning for a few years, seems to
        | have naturally avoided crawling any AI images. And it's all
        | around a great product.
        
       | voiper1 wrote:
       | Of course my first thought was: Let's use this as a tool for AI
       | searches (when I don't need recent news).
        
       | ricardo81 wrote:
       | FWIW Mojeek (an organic search engine in the classic sense) can
       | do this with the before: operator.
       | 
       | https://www.mojeek.com/search?q=britney+spears+before%3A2010...
        
       | pknerd wrote:
       | Something generated by humans does not mean high quality.
        
         | Krssst wrote:
         | Yes, but AI-generated is always low quality so it makes sense
         | to filter it out.
        
           | IshKebab wrote:
           | I wouldn't say _always_... Especially because you probably
           | only noticed the bad slop. Usually it is crap though.
        
           | josephjrobison wrote:
           | Grokipedia would like a word
        
         | a5c11 wrote:
          | At least when reading human-made material you can spot the
          | author's uncertainty on some topics. Usually, when someone
          | doesn't have knowledge of something, they don't try to describe
          | it. AI, however, will try to convince you that pigs can fly.
        
       | cryptozeus wrote:
       | technically you can ask chatgpt to return the same result by
       | asking it to filter by year
        
       | RomanPushkin wrote:
        | For that purpose I do not update my book about Ruby on LeanPub. I
        | just know one day people are gonna read it more, because human-
        | written content will be gold.
        
       | zkmon wrote:
        | Most college courses and school books haven't changed in decades.
        | Some reputed colleges keep courses on Pascal and Fortran instead
        | of Python or Java, just because it might affect their reputation
        | for being classical or pure, or to match the style of their
        | campus buildings.
        
         | fastasucan wrote:
          | Or because the core knowledge stays the same no matter how it
          | is expressed.
        
       | defraudbah wrote:
        | ChatGPT also only returns content created before ChatGPT's
        | release, which is why I still have to google, damn it!
        
         | fragmede wrote:
         | Click the globe icon below the input box to enable web
         | searching by ChatGPT.
        
         | stinos wrote:
         | Is that still the case? And even if so how is it going to avoid
         | keeping it like that in the future? Are they going to stop
         | scraping new content, or are they going to filter it with a
         | tool which recognizes their own content?
        
           | defraudbah wrote:
            | It's a known problem in ML. I think Grok solved it partially,
            | and ChatGPT uses another model on top to search the web, as
            | suggested below. Hence the MLOps field appeared, to handle
            | model management.
            | 
            | I find it a bit annoying to navigate between hallucinations
            | and outdated content. Too much invalid information to filter
            | out.
        
       | ETH_start wrote:
       | I'm grateful that I published a large body of content pre-ChatGPT
       | so that I have proof that I'm not completely inarticulate without
       | AI.
        
       | phplovesong wrote:
       | The slop is getting worse, as there is so much llm generated shit
       | online, now new models are getting trained on the slop. Slop
       | training slop, and slop. We have gone full circle just in a
       | matter of a few years.
        
         | muixoozie wrote:
          | I was replaying Cyberpunk 2077 and trying to think of all the
          | ways one might have dialed up the dystopia to 11 (beyond what
          | the game does), and pervasive AI slop was never on my radar.
          | Kinda reminds me of the foreword in Neuromancer bringing
          | attention to the fact that the book was written before
          | cellphones became popular. It's already fucking with my mind. I
          | recently watched Frankenstein (2025) and was 100% sure gen AI
          | had a role in the CGI, only to find out the director hates it
          | so much he'd rather die than use it. I've been noticing little
          | things in old movies and anime where I thought to myself: if I
          | didn't know this was made before gen AI, I would have thought
          | this was generated for sure. One example
          | (https://www.youtube.com/watch?v=pGSNhVQFbOc&t=412): the
          | cityscape background in the outro scene, with buildings built
          | on top of buildings, gave me AI vibes (really the only thing in
          | the whole anime), yet this came out around 1990. So I can
          | already recognize a paranoia / bias in myself and really can't
          | reliably tell what's real. Probably other people have this too,
          | which is why some non-zero number of people always think every
          | blog post that comes out was written by gen AI.
        
           | Cyan488 wrote:
           | I had the same experience, watching a nature documentary on a
           | streaming service recently. It was... not so good, at least
           | at the beginning. I was wondering if this was a pilot for AI
           | generated content on this streaming service.
           | 
           | Actually, it came out in 2015 and was just low budget.
        
       | dinkblam wrote:
       | google results were already 90% SEO crap long before ChatGPT
       | 
       | just use Kagi and block all SEO sites...
        
         | paweladamczuk wrote:
         | How do we (or Kagi) know which ones are "SEO sites"? Is there
         | some filter list or other method to determine that?
        
           | Jolter wrote:
           | If you took Google of 2006, and used that iteration of the
           | pagerank algorithm, you'd probably not get most of the SEO
           | spam that's so prevalent in Google results today.
        
           | joshvm wrote:
           | It seems like a mixture of heuristics, explicit filtering and
           | user reports.
           | 
           | https://help.kagi.com/kagi/features/slopstop.html
           | 
           | That's specifically for AI generated content, but there are
           | other indicators like how many affiliate links are on the
           | page and how many other users have downvoted the site in
           | their results. The other aspect is network effect, in that
           | everyone tunes their sites to rank highly on Google. That's
           | presumably less effective on other indices?
        
       | shevy-java wrote:
       | > This is a search tool that will only return content created
       | before ChatGPT's first public release on November 30, 2022.
       | 
        | The problem is that Google's search engine - and, oddly enough,
        | ALL search engines - got worse before that already. I noticed
        | that search engines got worse several years before 2022. So AI
        | further decreased the quality, but the quality was already on a
        | downward trend as it was. There are some attempts to analyse this
        | on YouTube (also owned by Google - Google ruins our digital
        | world); some explanations made sense to me, but even then I am
        | not 100% certain why Google decided to ruin Google search.
        | 
        | One key observation I made was that the YouTube search was copied
        | onto Google's regular search, which makes no sense for Google
        | search. If I casually search for a video on YouTube, I may be
        | semi-interested in unrelated videos. But if I search on Google
        | search for specific terms, I am not interested in crap such as
        | "others also searched for xyz" - that is just cluttering the UI
        | with irrelevant information. This is not the only example; Google
        | made the search results worse here and tries to confuse the user
        | into clicking on things. Plus the placement of ads. The quality
        | really worsened.
        
         | justinclift wrote:
         | Are you aware of Kagi (kagi.com)?
         | 
         | With them, at least the AI stuff can be turned off.
         | 
          | Membership is presently about 61k, and seems to be growing by
          | about 2k per month: https://kagi.com/stats
        
           | amelius wrote:
           | Be aware of:
           | 
           | https://www.reddit.com/r/SearchKagi/comments/1gvlqhm/disappo.
           | ..
        
             | justinclift wrote:
             | Damn. I didn't know that.
             | 
             | Now we need a 2nd Kagi, so we can switch to that one
             | instead. :(
        
             | smusamashah wrote:
             | There are few other powerful countries, with countless Web
             | services, who freely wages war(s) on other countries and
             | support wars in many different ways. Is there a way to
             | avoid their products?
        
               | jwr wrote:
               | Whataboutism doesn't get us anywhere -- saying "but what
               | about X" (insert anything for X here) usually results in
               | doing nothing.
               | 
               | Some of us would rather take a stand, imperfect as it is,
               | than just sit and do nothing. Especially in the very
               | clear case of someone (Kagi) doing business with a
               | country that invaded a neighboring country for no reason,
               | and keeps killing people there.
        
               | graemep wrote:
               | Why this particular stand? Is doing nothing any better
               | than taking what are essentially random stands? Obviously
               | if you are Ukrainian this will be an important stand to
               | you, but otherwise doing things based on a mix of what
               | the media you like focuses on or whatever is not really
               | very different from doing nothing.
        
               | baconbrand wrote:
               | Doing something is literally the opposite of doing
               | nothing. This is complete gibberish.
        
               | clucas wrote:
               | I think "no wars of conquest" is a bright line that was
               | crossed by Russia, that hasn't been crossed by other
               | nations in a long time. And I think it's important for
               | the whole world to take a stand on that, not just the
               | nation that was invaded. It's not a "random stand."
        
               | Alex2037 wrote:
               | >"no wars of conquest"
               | 
                | how about "no wars of genocide"? you know, like the one
                | the collective West has enthusiastically supported for a
                | while now?
        
               | AlecSchueler wrote:
               | Plenty of people boycott Israeli goods and there's an
               | increasing trend of moving away from reliance on American
               | services also.
        
               | DrammBA wrote:
               | did you just "but what about X" to the previous comment
               | which is the whole point of this thread?
        
               | jwr wrote:
                | I am amused that my (unpopular and by now downvoted)
                | comment about the scourge of "whataboutism" sparked a
                | discussion where comments begin with "how about" :-)
               | 
               | That is exactly my point! Saying "but what about" is akin
               | to saying "you shouldn't do anything, because there is
               | another unrelated $thing happening elsewhere". I refuse
               | to follow this line of thinking.
        
               | clucas wrote:
               | I find it much easier to take a strong stand on
               | Russia/Ukraine than on Israel/Palestine. The history of
               | Israel/Palestine is much more of a gray area. Palestine
               | has used plenty of aggressive actions and rhetoric that
               | make Israel's actions more understandable (if not
               | justified).
               | 
               | Example of actions: Gaza invaded Israel and killed,
               | raped, and kidnapped civilians on October 7. Ukraine had
               | no such triggering event that caused Russia to invade.
               | 
               | Example of rhetoric: Gaza's political leaders have said
               | they want to destroy Israel. I don't think anyone in
               | power in Ukraine has said they want to destroy the
               | Russian state.
        
               | donkyrf wrote:
               | "enthusiastic support"
               | 
               | https://yougov.co.uk/international/articles/52279-net-
               | favour...
               | 
               | https://www.pewresearch.org/politics/2025/10/03/how-
               | american...
               | 
               | etc etc....
               | 
               | I'm not sure what collective West you're referring to;
               | but apparently it excludes every major Western European
               | nation, America, and Canada.
        
               | jwr wrote:
               | > Why this particular stand?
               | 
               | First, _any_ stand is better than whataboutism and just
               | sitting there doing nothing.
               | 
               | Second, this stand results from my thoughts. It is my
               | stand. There are many like it, but this one is mine.
               | 
               | Third, in the history of the modern world there were very
               | few black&white situations where there was one side which
               | was clearly the aggressor. This is one of them.
        
               | graemep wrote:
               | > First, any stand is better than whataboutism and just
               | sitting there doing nothing.
               | 
               | I definitely disagree with this. There are many cases
               | where you might take the wrong stand, especially where
                | you do not have detailed knowledge of the issue you're
               | taking a stand over.
        
               | artursapek wrote:
               | "whataboutism" is the reddit word for calling out
               | hypocrisy
        
               | mcv wrote:
               | As a European, I'm also increasingly in favour of
               | avoiding American companies. Especially the big
               | corrupting near-monopolists.
               | 
               | It's worth pointing out the flaws of all bad actors. The
               | more info we have, the more effectively we can act.
        
             | eirini1 wrote:
             | I don't agree with this logic. It implies that people who
             | use Google, Bing and a million other products made by US-
             | based companies are supportive of the huge amount of
                | atrocities committed or aided by the United States. Or
             | other countries. It feels very odd to single out Russia's
             | invasion of Ukraine but to minimize the Israeli genocide of
             | palestinians in Gaza, the multiple unjust wars waged by the
             | United States all over the world etc.
        
               | kortilla wrote:
               | Google doesn't censor those atrocities for the US
               | government. That's the key difference.
        
               | ssl-3 wrote:
               | It's often fairly easy to find US government-centric news
               | and criticism with Google.
               | 
               | But as one counterexample: The end of the US penny was
               | formed and announced not with public legislative
               | discourse, nor even with an executive order, but with a
               | brief social media post by the president.
               | 
               | And I don't mean that it's atrocious or anything, but I
               | wanted to see that social media post myself. Not a report
               | about it, or someone's interpretation of it, but -- you
               | know -- the actual utterance from the horse's mouth.
               | 
               | Which should be a simple matter. After all, it's the WWW.
               | 
               | And I've been Googling for as long as there has been a
               | Google to Google with. I'd like to think that I am
               | proficient at getting results from it.
               | 
               | But it was like pulling teeth to get Google to
               | eventually, kicking and screaming, produce a link to the
               | original message on Truth Social.
               | 
               | If that kind of active reluctance isn't censorship on
               | Google's part, then what might it be described as
               | instead?
               | 
               | And if they're seeking to keep me away from the root of
               | this very minor issue, then what else might they also be
               | working to keep me from?
        
               | baconbrand wrote:
               | It doesn't imply any of that at all.
               | 
               | There certainly is a huge army of people ready to spout
               | this sort of nonsense in response to anyone talking about
               | doing anything.
               | 
               | Hard to know what percentage of these folks are trying to
               | assuage their own guilt and what percentage are state
               | actors. Russia and Israel are very chronically online,
               | and it behooves us internet citizens to keep that in
               | mind.
        
             | super256 wrote:
             | Yandex has the best image search, and others are years
             | behind it. Furthermore, Nebius has sold all of the group's
             | businesses in Russia and certain international markets. They
             | have been completely divested from Russia for 1.5 years
             | already: https://nebius.com/newsroom/ynv-announces-
             | successful-complet...
             | 
             | The post you linked was posted when the divestment was
             | already underway, so it is at least dishonest if not
             | malicious.
        
               | cluckindan wrote:
               | I wouldn't trust a divorce where one party still provides
               | for the other.
        
               | _heimdall wrote:
               | You don't "trust" a divorce if alimony was part of the
               | settlement?
        
               | kortilla wrote:
               | Yep, when the party paying can decide not to pay and
               | there are no teeth to extract payment, that gives immense
               | power to the payer.
        
               | _heimdall wrote:
               | At least in my area, there are legal avenues if alimony
               | goes unpaid. Assets can be seized to pay off late
               | payments and wages can be garnished.
               | 
               | It's a different story if the payer truly can't afford to
               | pay the alimony, but at that point they wouldn't have the
               | immense power you are concerned with.
        
               | varjag wrote:
               | Yandex is the government approved search engine in
               | Russia, which is impossible without the state exerting
               | control over it. I wouldn't pay much attention to
               | divestment, it's not how any of that works.
               | 
               | For instance here you can learn that Yandex NV is fully
               | controlled by a group of Russian investors: https://www.r
               | bc.ru/business/06/03/2024/65e7a0f29a7947609ea39...
        
               | oh_fiddlesticks wrote:
               | The governments where the offices of a software company
               | are physically located exert control over them. To follow
               | this logic to its end and apply it even-handedly surely
               | results in nation-based NIH syndrome?
        
               | varjag wrote:
               | You are talking about an entity whose ownership is 99.8%
               | Russian nationals and state companies; whose employees
               | for the most part are Russian nationals, whose main
               | market is Russia, and which has very few tangible assets
               | that could be seized in the Netherlands. The only reason
               | for this "divestment" is sanctions evasion.
        
               | tryauuum wrote:
               | you clearly don't know anything about nebius
               | 
               | They have a lot of hardware in e.g. Finland. I don't
               | think they provide GPU access to Russian companies, but
               | feel free to correct me.
        
               | varjag wrote:
               | We were talking about search engines here, but interesting
               | indeed! What's the name of the Nebius CEO?
        
               | stopthe wrote:
               | Some clarification: in 2024, Yandex NV split into Nebius
               | (an NL-registered, NASDAQ-listed company, no longer a
               | search engine) and the Russia-based Yandex. The latter is
               | fully controlled by Russian investors.
        
               | Ylpertnodi wrote:
               | https://news.ycombinator.com/item?id=42349797 (11 months
               | ago)
               | 
               | https://som.yale.edu/story/2022/over-1000-companies-have-
               | cur...
               | 
               | You pays your money, you takes your choice.
        
               | hopelite wrote:
               | You are mistaken to think that zealots can be reasoned
               | with. They have been conditioned to react upon anything
               | "Russia" like a Pavlovian cue, a command of the trained
               | animal. They are a herd that moves as a herd, based on
               | cues of lead animals. No amount of proof or evidence will
               | ever dissuade them from a position that the herd is
               | moving in. They cannot reason on their own and lack the
               | courage to separate, let alone say something that the
               | herd disapproves of, lest they be expelled from the herd
               | and ganged up on.
        
             | spIrr wrote:
             | Thank you. Didn't know that and was, until now, considering
             | paying for a Kagi subscription.
        
             | scotty79 wrote:
             | > "We do not discriminate based on current geopolitical
             | issues."
             | 
             | That's one way of phrasing it.
        
             | artursapek wrote:
             | based Vlad tbh
        
             | Ferret7446 wrote:
             | I find this amusing: it seems like Kagi's target audience
             | (politically polarized) dislikes this, while I, someone who
             | is not Kagi's target audience (politically neutral), like
             | it.
        
               | embedding-shape wrote:
               | Wait, what? Their choice is specifically a politically
               | neutral one, wouldn't that mean their target audience is
               | a politically neutral one? Why is your impression that
               | Kagi's target audience is politically polarized users?
               | Been a paying user of Kagi for years, never got that
               | impression.
               | 
               | FWIW, I don't think Kagi should remove or avoid indexing
               | content from countries that invade others, because a lot
               | of the times websites in those countries have useful
               | information on them. If Kagi were to enact such a block,
               | it would mean it would no longer surface results from HN,
               | reddit and a bunch of other communities, effectively
               | making the search engine a lot less useful.
        
               | bawolff wrote:
               | Politics is not just a 1 dimensional line.
        
               | saturnite wrote:
               | Yeah, it's two dimensional. One axis goes from good to
               | evil. The other axis, chaotic to lawful.
        
               | Dilettante_ wrote:
               | There's a secret third dimension you can ascend to
               | through a hole in the neutral middle where the forces of
               | the other two axes cancel out. 'The Elites' don't want
               | you to know this.
               | 
               | /hj?
        
               | brendoelfrendo wrote:
               | Why is supporting Yandex, who are involved in Russian
               | politics and linked to the ruling regime, a neutral
               | decision? That is very much a political decision, in the
               | same way that working with US tech companies is a
               | political decision. You need to decide what you're
               | willing to tolerate and where your ethical lines are
               | drawn; the alternative isn't neutrality, it's nihilism.
        
               | lostlogin wrote:
               | Solution: Kagi as it is, but with a 'remove Yandex'
               | toggle. Even if it was a paid upgrade, I'd take it.
        
             | duxup wrote:
             | Yeah I kept thinking "man I should try kagi" and then that
             | :(
        
               | akie wrote:
               | Try it anyway.
        
               | alessioalex wrote:
               | He probably doesn't want to support genocide.
        
               | richwater wrote:
               | Hope he doesn't pay his taxes then considering where US
               | aid ends up
        
               | duxup wrote:
               | I pay my taxes; that's not optional. Paying for a search
               | engine is.
        
               | duxup wrote:
               | Naw, the well is poisoned and I question the company's
               | decision making at this point.
        
             | buellerbueller wrote:
             | Imo, Kagi is _still_ the better option, because it isn't
             | supporting the global surveillance mechanism we call
             | advertising. All these people, missing the forest for the
             | single yandex tree.
        
             | stronglikedan wrote:
             | Meh. Most people, including myself, couldn't care less, and
             | Yandex image search is very capable.
        
             | troyvit wrote:
             | So if America invades Venezuela should we all stop using
             | google? Should we have stopped using google when the U.S.
             | invaded Iraq and killed 150,000 people[1]?
             | 
             | Should we stop using products imported from China for the
             | cultural genocide they've perpetrated against the
             | Uyghurs?[2]
             | 
             | Is Yandex Russia?
             | 
             | [1]
             | https://en.wikipedia.org/wiki/Casualties_of_the_Iraq_War
             | 
             | [2] https://en.wikipedia.org/wiki/Persecution_of_Uyghurs_in
             | _Chin...
        
               | brendoelfrendo wrote:
               | Honest answers are yes, yes, and yes. It may be
               | impossible for the average person to avoid imported
               | goods from China, but we should remain aware of our place
               | in the world and try where we can. If the US does invade
               | Venezuela, I sincerely hope that individuals and business
               | owners try to cut as many ties with complicit US tech
               | companies as possible. Honestly, with this clusterfuck of
               | war crimes going on over "drug boats," I hope they're
               | already starting.
        
               | alessioalex wrote:
               | You can take whatever stand you want. When there's a
               | country that killed, raped and tried to exterminate most
               | of Eastern Europe we can choose to cut any and all ties
               | with it and consider them for all intents and purposes
               | ...terrorists.
        
               | mcv wrote:
               | And the fact that there are other countries that should
               | also be considered terrorists, doesn't mean we shouldn't
               | boycott this one. It means we should boycott them all.
               | But boycotting a few is still better than nothing.
        
               | troyvit wrote:
               | I sort of see where you're coming from, but to me it also
               | involves a double standard. Don't buy search from a
               | company that uses an api from another company that is (or
               | was? unclear) based in a country that invaded another
               | country and completely upended the world order. For some
               | people that's a line that they don't want to cross and I
               | get it.
               | 
               | However if that's the case how can they continue buying
               | Chinese products when China has done the same thing, but
               | worse, and for longer, to their own population? Because
               | it's less convenient to stop? _That_ to me lands squarely
               | in the "take whatever stand you want" category with the
               | addendum of, "and don't worry if it doesn't make sense."
               | 
               | Is it because it's within their own borders and therefore
               | isn't our problem?
        
             | phantasmish wrote:
             | I _directly_ use Yandex sometimes, because there are huge
             | blind spots for all the US-based engines I'm aware of, and
             | it fills some of them in.
             | 
             | If someone can point me to a better index for that purpose,
             | I'd love to avoid Yandex. Please inform me.
        
             | devmor wrote:
             | Kagi is based in the United States, as is YC.
             | 
             | If you are concerned about heinous war crimes and the
             | slaughter of civilians to the point that you don't want to
             | use private services from countries that conduct such acts,
             | you should avoid both already.
        
             | immibis wrote:
             | Why's that something to be aware of? Yandex is actually a
             | good search engine, so I'm told, as long as you don't
             | search for things related to Russian politics. Kagi
             | presumably knows this and won't use their results related
             | to Russian politics.
             | 
             | Feels more like a scare campaign to me - someone doesn't
             | want you to use Kagi, and points to Yandex as a reason for
             | that.
        
             | Seattle3503 wrote:
             | I'm surprised this is possible given the sanctions on
             | Russia.
        
           | dncornholio wrote:
           | How does Kagi know what is AI stuff? I don't see how they can
           | 'just turn it off'
        
             | Zambyte wrote:
             | It's driven by community ratings.
             | 
             | https://news.ycombinator.com/item?id=45919067
        
               | pratyahava wrote:
               | So the humans-vs-robots war has started? Robots ask humans
               | questions to verify they are not robots; humans mark
               | content as robot-generated to filter it out.
        
               | pvdebbe wrote:
               | My first instinct is that users abuse it like they do any
               | other report/downvote mechanism. They see something they
               | just plain don't like, they report it as AI slop.
        
             | justinclift wrote:
             | By "turn it off" I mostly mean that Kagi have their own AI
             | driven tools available, but a toggle in your user settings
             | disables it completely.
             | 
             | ie it's not forced down your throat, nor
             | mysteriously/accidentally/etc turned back on occasionally
        
           | mebizzle wrote:
           | Haven't looked back since I signed up.
        
           | tempacct2cmmnt wrote:
           | I've had much better results with Kagi than with Google in
           | the past few months. I'd trialed them a couple times in the
           | past and been disappointed, but that's no longer the case.
        
           | PaulDavisThe1st wrote:
           | The AI stuff in google search can be turned off.
           | https://www.google.com/search?udm=14&q=kagi
           | 
           | My default browser search tool is set to google with ?udm=14
           | automatically appended.
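           | 
           | A minimal sketch (Python, purely for illustration) of
           | building such a URL yourself; the udm=14 parameter is the
           | only Google-specific part, and the rest is plain urllib:
           | 
           |     from urllib.parse import urlencode
           | 
           |     BASE = "https://www.google.com/search?"
           | 
           |     def web_only_url(query: str) -> str:
           |         # udm=14 selects the plain "Web" results view
           |         return BASE + urlencode({"q": query, "udm": "14"})
           | 
           |     print(web_only_url("kagi"))
           |     # -> https://www.google.com/search?q=kagi&udm=14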
        
             | nailer wrote:
             | What is UDM? Presumably the U is Urchin but what's the
             | rest?
        
               | PaulDavisThe1st wrote:
               | Never seen it documented.
        
         | Maken wrote:
         | There is also the fact that automatically generated content
         | predates ChatGPT by a lot. By around 2020 most Google searches
         | already returned lots of SEO-optimized pages made from scraped
         | content, or keyword soups made by rudimentary language models
         | or Markov chains.
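         | 
         | For anyone who never ran into those generators, a minimal
         | sketch (Python here purely for illustration; real spam
         | tooling varied) of the word-level Markov chain trick: learn
         | which word tends to follow which from scraped text, then
         | random-walk that table to emit keyword soup.
         | 
         |     import random
         |     from collections import defaultdict
         | 
         |     def build_chain(text):
         |         words = text.split()
         |         chain = defaultdict(list)
         |         for cur, nxt in zip(words, words[1:]):
         |             chain[cur].append(nxt)
         |         return chain
         | 
         |     def babble(chain, length=30):
         |         word = random.choice(list(chain))
         |         out = [word]
         |         for _ in range(length - 1):
         |             followers = chain.get(word) or list(chain)
         |             word = random.choice(followers)
         |             out.append(word)
         |         return " ".join(out)
         | 
         |     corpus = "best travel credit card best points card ..."
         |     print(babble(build_chain(corpus)))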
        
           | black3r wrote:
           | Well there's also the fact that GPT-3 API was released in
           | June 2020 and its writing capabilities were essentially on
           | par with ChatGPT initial release. It was just a bit harder to
           | use, because it wasn't yet trained to follow instructions, it
           | only worked as a very good "autocomplete" model, so prompting
           | was a bit "different" and you couldn't do stuff like "rewrite
           | this existing article in your own words" at all, but if you
           | just wanted to write some bullshit SEO spam from scratch it
           | was already as good as ChatGPT would be 2 years later.
        
             | wongarsu wrote:
             | Also the full release of GPT-2 in late 2019. While GPT-2
             | wasn't really "good" at writing, it was more than good
             | enough to make SEO spam
        
             | Maken wrote:
             | I didn't remember that, but it would explain the
             | exponential growth of spam back then.
        
           | gield wrote:
           | And 10 years ago, Reddit was already experimenting with auto-
           | generated subreddits:
           | https://www.reddit.com/r/SubredditSimulator.
        
           | PunchyHamster wrote:
           | It was popular way before 2020, but Google managed to keep up
           | with SEO tricks for a good decade-plus before that. Guess it
           | got to a breaking point.
        
         | bratwurst3000 wrote:
         | The main theory is that with bad results you have to search
         | more and engage more with ads, so more revenue for Google.
         | It's enshittification.
        
         | master-lincoln wrote:
         | I think this is about trustworthy content, not about a good
         | search engine per se
        
           | trinix912 wrote:
           | But it's not necessarily trustworthy content; we had
           | autogenerated listicles and keyword-list sites before
           | ChatGPT.
        
             | GTP wrote:
             | Sure, but I think that the underlying assumption is that,
             | after the public release of ChatGPT, the amount of
             | autogenerated content on the web became significantly
             | bigger. Plus, the auto-generated content was easier to spot
             | before.
        
         | robot-wrangler wrote:
         | > Google made the search results worse here
         | 
         | Did you mean:
         | 
         | worse results near me
         | 
         | are worse results worth it
         | 
         | worse results net worth
         | 
         | best worse results
         | 
         | worse results reddit
        
           | d-lisp wrote:
           |     search: Emacs
           |     Did you mean vim?
           | 
           | (vice-versa)
        
             | ganzsz wrote:
             | Tbh, this sounds like a Google Easter egg.
        
               | mghackerlady wrote:
               | Because it is
        
         | zipy124 wrote:
         | Honestly the biggest failing is just that SEO spam sites got
         | too good at defeating the algorithm. The number of bloody
         | listicles, Quora nonsense, and backlink-farming websites that
         | come up in search is crazy.
        
           | Nextgrid wrote:
           | This is bullshit the search engines want you to believe. It's
           | trivial to detect sites that "defeat" the algorithm; you
           | simply detect their incentives (ads/affiliate links) instead.
           | 
           | Problem is that no mainstream search engine will do it
           | because they happen to _also_ be in the ad business and
           | wouldn't want to reduce their own revenue stream.
        
           | AznHisoka wrote:
           | For most commercial-related terms, I suspect that if you got
           | rid of all "spammy" results you would be left with almost
           | nothing. No independent blogger is gonna write about the best
           | credit card with travel points.
        
             | eszed wrote:
             | I agree with your point, but you picked a poor example.
             | Have you _met_ any credit reward min-maxers?
        
             | baconbrand wrote:
             | I had a coworker who kept up a blog about random purchases
             | she'd made, where she would earn some money via affiliate
             | links. I thought it was horrendously boring and weird, and
             | the money made was basically pocket change, but she seemed
             | to enjoy it. You might be surprised, people write about all
             | sorts of things.
        
               | asdff wrote:
               | People used to do it in the early internet days, before
               | affiliate marketing really took it over. Certainly it
               | was more
               | genuine and products were bemoaned for their compromises
               | in one dimension as much as praised for their performance
               | in another. Everything is a glowing review now and
               | comparisons are therefore meaningless.
        
             | strbean wrote:
             | Sites like Credit Karma / NerdWallet exist. While I think
             | they are rife with affiliate link nonsense and paid
             | promotion masquerading as advice, I'm also pretty sure they
             | have paid researchers and writers generating genuine
             | content. Not sure that quite falls into the bucket of SEO
             | blogspam.
        
               | asdff wrote:
               | It still counts because they would only ever recommend
               | affiliate partnered products.
        
           | watwut wrote:
           | Afaik they did not lose the fight. They stopped trying
           | because it was good for short-term earnings.
        
             | masfuerte wrote:
             | Yes, this is true. It was revealed in Google emails
             | released during antitrust hearings. Google absolutely made
             | a deliberate decision to enshittify their search results
             | for short term gains.
             | 
             | Though maybe it's a long term gain. I know many normal
             | (i.e. non-IT) people who've noticed the poor search
             | results, yet they continue to use Google search.
        
           | duxup wrote:
           | I feel like google gave up the fight at some point. I think
           | HN had some good articles that indicated that.
        
             | strbean wrote:
             | Certainly seems that way if you observed the waves of
             | usability Google search underwent in the first 15 years.
             | There were several distinct cycles where the results were
             | great, then garbage, then great again. They would be
             | flooded with SEO spam, then they would tweak and penalize
             | the SEO spam heavily, then SEO would catch up.
             | 
             | The funny thing is that it seems like when they gave up it
             | wasn't because of some new advancement in the arms race. It
             | was well before LLMs hit the scene. The SEO spam was still
             | incredibly obvious to a human reader. Really seems like
             | some data-driven approach demonstrated that surrendering on
             | this front led to increased ad revenue.
        
         | groundzeros2015 wrote:
         | Significant changes were made to Google and YouTube in 2016 and
         | 2017 in response to the US election. The changes provided more
         | editorial and reputation based filtering, over best content
         | matching.
        
         | benterix wrote:
         | > if I search on Google search for specific terms, I am not
         | interested in crap such as "others also searched for xyz" -
         | that is just ruining the UI with irrelevant information
         | 
         | You assume the aim here is for you to find relevant
         | information, not increase user retention time. (I just love the
         | corporate speak for making people's lives worse in various
         | ways.)
        
           | mcv wrote:
           | You finding relevant information used to be the aim.
           | Enshittification started when they let go of that aim.
        
         | 123malware321 wrote:
         | ML and AI killed it somewhere between 2011 and 2016.
         | https://en.wikipedia.org/wiki/Dead_Internet_theory
        
         | juujian wrote:
         | The problem is that before Nov 30, 2022 we also had plenty of
         | human-generated slop bearing down on the web. SEO content
         | specifically.
        
         | ForHackernews wrote:
         | Goodhart's law applies to links, too. Google monetized them and
         | destroyed their value as a signal.
        
         | jollyllama wrote:
         | > The problem
         | 
         | That's a separate problem. The search algorithm applied on top
         | of the underlying content is a different issue from the
         | quality or origin of that content, in aggregate.
        
         | 0xEF wrote:
         | > I am not 100% certain why Google decided to ruin google
         | search.
         | 
         | Ask Prabhakar Raghavan. Bet he knows.
        
         | xnx wrote:
         | Counterpoint: The experience of quickly finding succinct
         | accurate responses to queries has never been better.
         | 
         | Years ago, I would consider a search "failed" if the page with
         | related information wasn't somewhere in the top 10. Now a
         | search is "failed" if the AI answer doesn't give me exactly
         | what I'm looking for directly.
        
         | codyb wrote:
         | I've been using DuckDuckGo for the last... decade or so. And it
         | still seems to return fairly relevant documentation towards the
         | top.
         | 
         | To be fair, most of what I use search for these days is
         | "<<Programming Language | Tool | Library | or whatever>>
         | <<keyword | function | package>>" then navigate to the
         | documentation, double check the versions align with what I'm
         | writing software in, read... move on.
         | 
         | Sometimes I also search for "movie showtimes nyc" or for a
         | specific venue or something.
         | 
         | So maybe my use cases are too specific to screw up, who knows.
         | If not, maybe DDG is worth a try.
        
           | geldedus wrote:
           | DuckDuckGo uses Bing search results.
        
       | EGreg wrote:
       | Can't we just append "before:2021-01-01" to Google?
       | 
       | I use this to find old news articles for instance.
        
       | theodric wrote:
       | This tool has no future. We have that in common with it, I fear.
       | 
       | What we really need to do is build an AI tool to filter out the
       | AI automatically. Anybody want to help me found this company?
        
       | keiferski wrote:
       | Projects like this remind me of a plot point in the Cyberpunk
       | 2077 game universe. The "first internet" got too infected with
       | dangerous AIs, so much so that a massive firewall needed to be
       | built, and a "new" internet was built that specifically kept out
       | the harmful AIs.
       | 
       | (Or something like that: it's been a while since I played the
       | game, and I don't remember the specific details of the story.)
       | 
       | It makes me wonder if a new human-only internet will need to be
       | made at some point. It's mostly sci-fi speculation at this point,
       | and you'd really need to hash out the details, but I am thinking
       | of something like a meatspace-first network that continually
       | verifies your humanity in order for you to retain access. That
       | doesn't solve the copy-paste problem, or a thousand other ones,
       | but I'm just thinking out loud here.
        
         | jascha_eng wrote:
         | The problem really is that it is impossible to verify that the
         | content someone uploads came from their mind and not a computer
         | program. And at some point probably all content is at least
         | influenced by AI. The real issue is also not that I used
         | ChatGPT to look up a synonym or asked a question before writing
         | an article; the problem is when I copy-paste the content and
         | claim I wrote it.
        
           | Ylpertnodi wrote:
           | > The problem really is that it is impossible to verify that
           | the content someone uploads came from their mind and not a
           | computer program.
           | 
           | Er...digital id.
        
             | _heimdall wrote:
             | Ignoring the privacy and security issues for a moment, how
             | would having a digital ID prove that the blog post I put on
             | my site came only out of my own mind and I didn't use an
             | LLM for it?
        
           | visarga wrote:
           | > the problem is when I copy paste the content and claim I
           | wrote it
           | 
           | Why is this the problem and not the reverse - using AI
           | without adding anything original into the soup? I could
           | paraphrase an AI response in my own words and it will be no
           | better. But even if I used AI, if it writes my ideas, then it
           | would not be AI slop.
        
           | immibis wrote:
           | There doesn't need to be any difference in treatment between
           | AI slop and human slop. The point isn't to keep AI out - it's
           | to keep spam and slop out. It doesn't matter whether it's
           | produced by a being made of carbon or silicon.
           | 
           | If someone can consistently produce high-quality content with
           | AI assistance, so be it. Let them. Most don't, though.
        
             | jascha_eng wrote:
             | I think the main issue is that when content is hand written
             | you can be certain someone put at least the effort it takes
             | to write into it. And while some people write fast, I would
             | assume that at least means they have read their own writing
             | once.
             | 
             | AI slop you can produce faster than you're able to read it.
             | This makes it incredibly costly to filter out in
             | comparison. It just messes so much with the signal to noise
             | ratio on the web.
        
           | fao_ wrote:
           | > And at some point probably all content is at least
           | influenced by AI.
           | 
           | [citation needed]
           | 
           | (I see absolutely no reason why that should be the case)
        
             | asdff wrote:
             | The issue is most things being derivative, along with AI
             | now representing an increasing share of the "most things"
             | from which to derive.
        
         | lukebuehler wrote:
         | Arguably this is already happening, with much human-to-human
         | interaction moving to private groups on Signal, WhatsApp,
         | Telegram, etc.
        
         | SonnyTark wrote:
         | I share an opinion with Nick Bostrom: once a civilization-
         | disrupting idea (like LLMs) is pulled out of the bag, there is
         | no putting it back. People in isolation will recreate it simply
         | because it's now possible. All we can do is adapt.
         | 
         | That being said, the idea of a new, freer internet is
         | already a reality.
         | Mastodon is a great example. I think private havens like
         | discord/matrix/telegram are an important step on the way.
        
           | ionwake wrote:
           | how does one keep ai out of private havens? thorough
           | verification? is that the future? private havens on
           | platforms?
        
             | embedding-shape wrote:
             | In person web of trust in order to join any private
             | community. It'll suck and be hard in the beginning, but
             | once you reach a threshold, it'll be OK. Ban entire trees
             | of users when you discover bots/puppets, to set an example.
        
               | visarga wrote:
               | So we expect either 1. people using AI and copy pasting
               | into the human-only network, or 2. other people claiming
               | your text sounds like AI and ostracizing you for no good
               | reason. It won't be a happy place - I know from anti-
               | generative AI forums.
        
               | immibis wrote:
               | Yep and then you deperson them
        
         | visarga wrote:
         | > a new human-only internet
         | 
         | Only if those humans don't take their lead from AI. If they
         | read AI output and then write, there's not much benefit.
        
         | pavel_lishin wrote:
         | There were also similar plot points mentioned in Peter Watts'
         | Starfish trilogy, and Neal Stephenson's Anathem.
        
       | Roritharr wrote:
       | I hope there's an uncensored version of the Internet Archive
       | somewhere. I wish I could look at my website ca. 2001, but I
       | think it got removed because of some fraudulent DMCA claim
       | somewhere in the early 2010s.
        
       | lxgr wrote:
       | > This is a search tool that will only return content created
       | before ChatGPT's first public release on November 30, 2022.
       | 
       | How does it do that? At least Google seems to take website
       | creation date metadata at face value.
        
       | erikpukinskis wrote:
       | Interesting concept. As a side benefit this would allow you to
       | make steady progress fighting SEO slop as well, since there can
       | be no arms race if you are ignoring new content.
       | 
       | You could even add options for later cutoffs... for example, you
       | could use today's AIs to detect yesterday's AI slop.
        
       | softwaredoug wrote:
       | The other day I was researching with ChatGPT.
       | 
       | * ChatGPT hallucinated an answer
       | 
       | * ChatGPT put it in my memory, so it persisted between
       | conversations
       | 
       | * When asked for a citation, ChatGPT found 2 AI created articles
       | to back itself up
       | 
       | It took a while, but I eventually found human-written
       | documentation from the organization that created the technical
       | thingy I was investigating.
       | 
       | This happens A LOT for topics on the edge of the knowledge
       | that's easily found on the Web, where you have to do true
       | research, evaluate sources, and make good decisions about
       | what you trust.
        
         | fireflash38 wrote:
         | AI reminds me of combing through stackoverflow answers. The
         | first one might work... Or it might not. Try again, find a
         | different SO problem and answer. Maybe the third time's the
         | charm...
         | 
         | Except it's all via the chat bot and it isn't as easy to get it
         | to move off of a broken solution.
        
         | visarga wrote:
         | Simple solution - run the same query on 3 different LLMs with
         | different search integrations, if they concur chances of
         | hallucination are low.
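         | 
         | A minimal sketch of that cross-check (Python; the model
         | calls are left as hypothetical placeholders), accepting an
         | answer only when a strict majority agrees. Real answers
         | would need fuzzier matching than exact string equality, but
         | the shape is the same:
         | 
         |     from collections import Counter
         | 
         |     def majority_answer(answers):
         |         # normalize whitespace/case before comparing
         |         norm = [" ".join(a.lower().split()) for a in answers]
         |         ans, count = Counter(norm).most_common(1)[0]
         |         return ans if count > len(norm) / 2 else None
         | 
         |     # answers = [ask_a(q), ask_b(q), ask_c(q)]  # hypothetical
         |     answers = ["Nov 30, 2022", "Nov 30, 2022", "2023"]
         |     print(majority_answer(answers))  # nov 30, 2022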
        
           | baconbrand wrote:
           | Or you could just... not use LLMs
        
           | asdff wrote:
           | Or they've converged on the same bullshit
        
       | vertnerd wrote:
       | Just the other evening, as my family argued about whether some
       | fact was or was not fake, I detached from the conversation and
       | began fantasizing about whether it was still possible to buy a
       | paper encyclopedia.
        
       | stopthe wrote:
       | In hindsight, that would've been a real utility use case for
       | NFTs: a decentralized cryptographic proof that some content
       | existed in a particular form at a particular moment.
        
       | Barathkanna wrote:
       | I didn't know "eccentric engineering" was even a term before
       | reading this. It's fascinating how much creativity went into
       | solving problems before large models existed. There's something
       | refreshing about seeing humans brute force the weird edges of a
       | system instead of outsourcing everything to an LLM.
       | 
       | It also makes me wonder how future kids will see this era. Maybe
       | it will look the same way early mechanical computers look to us.
       | A short period where people had to be unusually inquisitive just
       | to make things work.
        
         | hombre_fatal wrote:
         | Maybe like how I view my dad and the punchcard era: cool and
         | endearing that he went through that, but thankful that I don't
         | have to.
        
       | dpedu wrote:
       | I mean I get it, but it seems a bit silly. What's next - an image
       | search engine that only returns images created before photoshop?
        
       | josephjrobison wrote:
       | The real gold is content created before the internet!
        
       | javaskrrt wrote:
       | This is such a great idea
        
       | audiala wrote:
       | It doesn't really work. I tried my website and it shows up,
       | despite definitely being built after 2023. There is a mistake
       | in the metadata of the page that shows it as being from 2011.
       | 
       | https://audiala.com/changelog
        
       | potato-peeler wrote:
       | You don't need an extension to do this. Simply add a "before:"
       | search filter to your search query, eg -
       | https://www.google.com/search?q=Happiness+before%3A2022
        
       | dwa3592 wrote:
       | so it's a filter by date and you chose ChatGPT's public
       | release?
        
       | Bad_Initialism wrote:
       | How about a search engine that only returns what you searched
       | for, and not a million other unrelated things that it hopes you
       | might like to buy?
       | 
       | This goes for you, too, website search.
        
       | diavarlyani wrote:
       | We now need an extension to hide 3 years of the internet because
       | it was written by robots. This timeline is undefeated.
        
       | throwawayk7h wrote:
       | I noticed AI-generated slop taking over google search results
       | well before ChatGPT. So I don't agree with this site's premise
       | that "you can be sure that it was written or produced by the
       | human hand."
        
       | 1vuio0pswjnm7 wrote:
       | "This browser extension uses the Google search API to only return
       | content published before Nov 30th, 2022 so you can be sure that
       | it was written or produced by the human hand."
        
       | micromacrofoot wrote:
       | What kind of heuristics does it use to determine age? a lot of
       | content on Google actually backdates for some reason...
       | presumably some sort of SEO scam?
        
       | stocksinsmocks wrote:
       | I really thought this was going to be the Dewey Decimal system.
       | Exclude sources from this century. It's the only way to be sure.
        
       | ris wrote:
       | For a while I've been saying it's a pity we hadn't been regularly
       | trusted-timestamping everything before that point as a matter of
       | course.
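       | 
       | A minimal sketch of what that could look like (Python; the
       | external timestamping step, e.g. an RFC 3161 TSA or
       | OpenTimestamps, is assumed rather than shown): you timestamp
       | a hash of the content when it's published, and later anyone
       | can re-hash the content and compare to show it existed
       | before the cutoff.
       | 
       |     import hashlib
       | 
       |     def digest(data: bytes) -> str:
       |         return hashlib.sha256(data).hexdigest()
       | 
       |     page = b"my definitely-human-written 2021 article..."
       |     # this hex digest is what would have been submitted to a
       |     # timestamping service back then; re-hashing the page
       |     # today and comparing proves the bytes are unchanged
       |     print(digest(page))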
        
       | 2OEH8eoCRo0 wrote:
       | low-background information
        
       ___________________________________________________________________
       (page generated 2025-12-01 23:01 UTC)