[HN Gopher] AI paid for by Ads - the GPT-4o mini inflection point
       ___________________________________________________________________
        
       AI paid for by Ads - the GPT-4o mini inflection point
        
       Author : thunderbong
       Score  : 140 points
       Date   : 2024-07-19 19:28 UTC (3 hours ago)
        
 (HTM) web link (batchmon.com)
 (TXT) w3m dump (batchmon.com)
        
       | rbax wrote:
       | This assumes a future where users are still depending on search
       | engines or some comparative tool. Profiting off the current
       | status quo. I would also be curious how user behavior will evolve
       | to identify, evade, and ignore AI generated content. Some quasi
       | arms race we'll be in for a long time.
        
         | Oras wrote:
         | That depends on how many users are aware of the AI content.
         | 
         | HN is not a reflection of the world.
        
           | ben_w wrote:
           | True, but ChatGPT has been interviewed by a national
           | television broadcaster in the UK at least, so I think it
           | broke out of our bubble no later than December 2022:
           | https://youtu.be/GYeJC31JcM0?si=gdmlxbtQnxAvBc1i
        
         | binkHN wrote:
         | This has already been happening for quite some time with users
         | ignoring Google search and searching Reddit directly. The irony
         | is that, I assume, most of Reddit's income right now is coming
         | from content licensing deals with AI companies.
        
       | kingkongjaffa wrote:
        | Well, computer hardware was stagnating without a forcing
        | function. Running LLMs locally is a strong incentive to get
        | more powerful hardware and run your own local models without
        | any ads.
        
       | loremaster wrote:
       | This has been possible for well over a year, just not with
       | OpenAI's API specifically.
        
         | transformi wrote:
         | Exactly! Even with local LLM running on the browser/client
         | side.
        
       | Animats wrote:
       | That's an inflection point, all right. OpenAI's customers can now
       | at least break even.
       | 
       | Of course, it means a flood of crap content.
        
         | ainoobler wrote:
         | The Internet has been flooded with crap content for some time
         | now so AI is simply accelerating the existing trends.
        
           | ToucanLoucan wrote:
            | Given the younger generations' increasing ambivalence to
            | the non-stop fire hose of bullshit that the vast majority
            | of the platform internet _already is,_ and given that
            | we're now
           | forging the tools to make said fire hose larger by numerous
           | factors, I don't think this is going to be the boon long-term
           | that a lot of people seem to think it is.
        
             | muzani wrote:
             | 90% of everything is crap.
             | 
             | Itch.io has almost no crap filters so all you find is crap.
             | Steam lets anyone publish but you rarely come across any
             | crap. Many PC game devs know that the income overwhelmingly
             | comes from Steam vs every other site put together.
             | 
             | Unfortunately, this just gives more power to the walled
             | gardens.
        
         | selalipop wrote:
         | I was going to disagree with the article because the content
         | 4o-mini generates isn't there yet.
         | 
         | I run a content site that is fully AI generated,
         | https://tryspellbound.com
         | 
         | It writes content that's worth reading, but it's _extremely
         | expensive to run._ It requires chain of thought, a RAG
         | pipeline, self-revision and more.
         | 
         | I spent most of yesterday testing it and pushed it to beta, but
         | the writing feels stilted and clearly LLM generated. The
         | inflection point will come for content people actually want to
         | read, but it's not going to be GPT-4o mini.
        
         | notatoad wrote:
         | >a flood of crap content
         | 
         | so, status quo? this sort of content only has value because
         | google links to it when people search, and because google runs
         | an ad network that allows monetizing it. google is also working
         | furiously to provide these same AI-generated answers in their
         | SERP, so they can eliminate this and monetize the answers
         | directly instead of paying out to random third parties.
         | 
         | i'm pretty skeptical that this ai-generated content will ever
         | be monetizable in the way the article suggests, simply because
         | google is better at it. if you're a human making your living by
         | writing articles that are indistinguishable from ai-generated
         | content, then you might be harmed by this but for most people
         | this inflection point is not going to be a noticeable change.
        
       | moffkalast wrote:
       | > For example, putting in 50k page views a month, with a Finance
       | category, gives a potential yearly earnings of $2,000.
       | 
       | > I'm going to take the median across all categories, which is an
       | estimated annual revenue of $1,550 for 50,000 monthly page views.
       | 
       | > This is approximately ~$0.00022 earned per page view.
       | 
       | The problem is... this doesn't take into account a million AI
       | generated sites suddenly all competing for the same amount of
       | eyes as before, driving revenue to zero very quickly. It'll be
       | worth something for a bit and then everyone will catch up.
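
        A toy Python sketch of the dilution effect described above: the
        pool of eyeballs stays roughly fixed while the number of
        near-identical AI sites grows, so per-site revenue shrinks toward
        zero. The per-view rate is derived from the quoted $1,550/year
        for 50,000 monthly views; the audience size and site counts are
        purely illustrative assumptions.

            # $1,550/year spread over 600,000 views/year, from the quoted estimate.
            revenue_per_view = 1550 / (50_000 * 12)   # ~$0.0026 per view

            total_monthly_views = 50_000_000          # fixed audience for the niche (illustrative)

            for n_sites in (1_000, 10_000, 100_000, 1_000_000):
                views_per_site = total_monthly_views / n_sites
                print(f"{n_sites:>9} sites -> {views_per_site:>8.0f} views/site, "
                      f"${views_per_site * revenue_per_view:>7.2f}/month each")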
        
         | TremendousJudge wrote:
         | Many people in the history of the internet have made a lot of
         | money by doing something that was "worth something for a bit
         | and then everybody caught up"
        
           | adw wrote:
           | You just described basically the entire trading strategy of
           | most high frequency traders.
        
         | Andrex wrote:
         | Presumably the assumption is that (as with capitalism) an ever-
         | growing population will paper over all problems.
        
         | SoftTalker wrote:
         | The same number of eyes will still be driven to a subset of
         | content by algorithmic influence. Whether search engines,
         | algorithmically-generated "viral" popularity, or whatever. Most
         | people are consuming whatever is placed in front of their
         | faces. That content will still have value, the trick will be
         | getting your content into that subset.
        
       | mtnGoat wrote:
        | Generating content on the fly is already happening, and has
        | been for a while. Word spinners fed by a script that grabs the
        | content of the first 5 Google results, Wikipedia, etc., have
        | been around a long time, and Google indexed the
        | incomprehensible garbage they created.
        | 
        | Low-cost models just lowered the barrier to entry.
        
       | tbatchelli wrote:
       | So google will eventually be mostly indexing the output of LLMs,
       | and at that point they might as well skip the middleman and
        | generate all search results by themselves, which, incidentally,
        | is how I am using Kagi today - I basically ask questions and
        | get the answers, and I barely click any links anymore.
       | 
       | But this also means that because we've exhausted the human
       | generated content by now as means of training LLMs, new models
       | will start getting trained with mostly the output of other LLMs,
       | again because the web (as well as books and everything else) will
        | be more and more LLM-generated. This will end up with very
        | interesting results --not good, just interesting-- akin to how
        | the message changes when kids play the telephone game.
       | 
        | So the snapshot of the web as it was in 2023 will be the last
        | time we had original content, as soon we will have stopped
        | producing new content and will just be recycling existing
        | content.
       | 
       | So long, web, we hardly knew ya!
        
         | talldayo wrote:
         | This seems like it would only work if you deliberately rank AI-
         | generated text above human generations.
         | 
         | If the AI generations are correct, is it really that bad? If
         | they're bad, I feel like they're destined to fall to the bottom
         | like the accidental Facebook uploads and misinformed "experts"
         | of yesteryear.
        
           | kevingadd wrote:
           | Where would the AI get the data necessary to generate correct
           | answers for novel problems or current events? It's largely
           | predictive based on what's in the training set.
        
             | talldayo wrote:
             | > Where would the AI get the data necessary to generate
             | correct answers for novel problems or current events?
             | 
             | In a certain sense, it doesn't really need it. I like to
             | think of the _Library of Babel_ as a grounding thought
             | experiment; technically, every truth and lie could have
             | already been written. Auguring the truth from randomness
             | _is_ possible, even if only briefly and randomly. The
              | existence of LLMs and tokenized text does a really good
              | job of turning statistics-soup into readable text.
             | 
             | That's not to say AI will always be correct, or even that
             | it's capable of consistent performance. But if an AI-
              | generated explanation of a particular topic is exemplary
              | beyond all human attempts, I don't think it's fair to
              | down-rank it as long as the text is correct.
        
               | Workaccount2 wrote:
                | The explosion in AI over the last decade has really
                | brought to light how incredibly self-aggrandizing
                | humans naturally are.
        
               | kevingadd wrote:
               | Are you suggesting that llms can predict the future in
               | order to address the lack of current event data in their
               | training set? Or is it just implicit in your answer that
               | only the past matters?
        
           | binary132 wrote:
           | who ranks the content
        
             | talldayo wrote:
             | Well, there's the problem. Truth be told though, the way
             | keyword-based SEO took off I don't really think it's any
             | better with humans behind the wheel.
        
               | jay_kyburz wrote:
               | We would lose the long tail, but if I were a search
               | engine, I would have a mode that only returned results on
               | a whitelist of domains that I would have a human eyeball
               | every few months.
               | 
               | If somebody had a site that we were not indexing and
               | wanted to be, they could pay a human to review it every
               | few months.
        
             | epidemian wrote:
             | Maybe us?
             | 
             | I mean us as in a network of trusted individuals.
             | 
             | For example, i've been appending "site:reddit.com" to some
             | of my Google queries for a while now --especially when
             | searching for things like reviews-- because, otherwise,
             | Google search results are unusable: ads disguised as fake
             | "reviews" rank higher than actual reviews made by people,
             | which is what i'm interested in.
             | 
              | I wouldn't be surprised if we evolve some similar
              | adaptations to deal with the flood of AI-generated shit.
              | Like favoring closer-knit communities of people we trust,
              | and penalizing AI sludge when it seeps in.
              | 
              | It's still sad though. In the meantime, we might lose a
              | lot of minds to this. Entire generations perhaps.
              | Watching older people fall for AI-generated trash on
              | Facebook is painful. I wish we had acted sooner.
        
           | PhasmaFelis wrote:
           | When the AI is wrong, the ranking algorithm isn't any better
           | at detecting that than the AI is.
        
         | gwervc wrote:
            | Maybe paper-based books will be fashionable again.
        
           | tbatchelli wrote:
           | Combine LLMs with on-demand printing and publishing platforms
           | like Amazon and realize that even print books can now be AI-
           | tainted.
        
             | input_sh wrote:
             | So what? Stupid shit gets posted as a "book" on Amazon all
             | the time, with or without AI.
             | 
             | Doesn't mean anyone buys it.
        
               | dartos wrote:
               | Hey woah. Take that reality elsewhere, sir.
               | 
               | We're doomering in this here thread.
               | 
               | /s
        
               | cogman10 wrote:
               | The issue is that the AI shit is flooding out anything
               | good. Nearly any metric you can think of to measure
               | "good" by is being gamed ATM which makes it really hard
               | to actually find something good. Impossible to discover
               | new/smaller authors.
        
               | rurp wrote:
               | Scale matters. The ability to churn out bad writing is
               | increasing by orders of magnitude and could drown out the
               | already small amount of high quality works.
        
             | zer00eyz wrote:
             | > The Fifty Shades trilogy was developed from a Twilight
             | fan fiction series originally titled Master of the Universe
             | and published by James episodically on fan fiction websites
             | under the pen name "Snowqueen Icedragon". Source :
             | https://en.wikipedia.org/wiki/Fifty_Shades_of_Grey
             | 
             | The AI is already tainted with human output.... If you
              | think it's spitting out garbage it's because that's what
              | we fed it.
             | 
             | There is the old Carlin bit about "for there to be an
             | average intelligence, half of the people need to be below
             | it".
             | 
              | Maybe we should not call it AI but rather AM, Artificial
              | Mediocrity; it would be a reflection of its source material.
        
           | jsheard wrote:
           | Print-on-demand means that paper books will be just as
           | flooded with LLM sludge as eBook stores. I think we are at
           | risk of regressing back to huge publishers being de-facto
           | gatekeepers, because every easily accessible avenue to
           | getting published is going to get crushed under this race to
           | the bottom.
           | 
           | Likewise with record labels if platforms like Spotify which
           | allow self-publishing get overwhelmed with Suno slop, which
           | is already on the rise (there's some conspiracy theories that
           | Spotify themselves are making it, but there's more than
           | enough opportunistic grifters in the world who could be
           | trying to get rich quick by spamming it).
           | 
           | https://old.reddit.com/r/Jazz/comments/1dxj409/is_spotify_us.
           | ..
        
           | InsideOutSanta wrote:
           | Beware the print-on-demand AI slop. Paper can not save us.
        
         | elromulous wrote:
         | The web before 2023 basically becomes like pre-atomic steel[0]
         | 
         | [0] https://en.wikipedia.org/wiki/Low-background_steel
        
         | vineyardmike wrote:
         | > this also means that because we've exhausted the human
         | generated content by now as means of training LLMs, new models
         | will start getting trained with mostly the output of other LLMs
         | 
         | There is also a rapidly growing industry of people whose job it
         | is to write content to train LMs against. I totally expect this
         | to be a growing source of training data at the frontier instead
         | of more generic crap from the internet.
         | 
          | Smaller models will probably still be trained on the output
          | of bigger models, however.
        
           | 0x00cl wrote:
           | > growing industry of people whose job it is to write content
           | to train LMs against
           | 
           | Do you have an example of this?
           | 
            | How do they differentiate content written by a person vs.
            | written by an LLM? I'd expect there are going to be people
            | trying to "cheat" by using LLMs to generate content.
        
             | vineyardmike wrote:
              | > How do they differentiate content written by a person
              | vs. written by an LLM
             | 
              | Honestly, not sure how to test it, but these are B2B
              | contracts, so hopefully there's some quality control. It's
             | part of the broad "training data labeling" business, so
             | presumably the industry has some terms in contracts.
             | 
             | ScaleAI, Appen are big providers that have worked with
             | OpenAI, Google, etc.
             | 
             | https://openai.com/index/openai-partners-with-scale-to-
             | provi...
        
         | squigz wrote:
          | > 2023 will be the last time we had original content, as soon
          | we will have stopped producing new content and will just be
          | recycling existing content.
         | 
         | This is just an absurd idea. We're going to just stop producing
         | new content?
        
           | mglz wrote:
           | No, but the scrapers cannot tell it apart from LLM output.
        
             | dartos wrote:
             | Yet
        
               | mglz wrote:
               | The LLM is trained by measuring its error compared to the
               | training data. It is literally optimizing to not be
               | recognizable. Any improvement you can make to detect LLM
               | output can immediately be used to train them better.
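
        A toy Python sketch of the objective being described above:
        standard next-token cross-entropy against human-written text.
        The tensors are random stand-ins for a real model and dataset;
        this only shows where the "error compared to the training data"
        comes from, nothing about detectors.

            import torch
            import torch.nn.functional as F

            vocab_size = 50_000
            batch, seq_len = 2, 16

            # Stand-in for the model's predicted logits at each position.
            logits = torch.randn(batch, seq_len, vocab_size, requires_grad=True)
            # Stand-in for the actual next tokens from human-written text.
            targets = torch.randint(0, vocab_size, (batch, seq_len))

            # Cross-entropy between predictions and the human continuation;
            # lowering it pushes the model's output distribution toward the
            # distribution of the training text.
            loss = F.cross_entropy(logits.view(-1, vocab_size), targets.view(-1))
            loss.backward()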
        
               | ben_w wrote:
               | GANs do that, I don't think LLMs do. I think LLMs are
                | mostly trained on "how do I reckon a human would rate this
                | answer?", or at least the default ChatGPT models are and
               | that's the topic at the root of this thread. That's
               | allowed to be a different distribution to the source
               | material.
               | 
                | Observable: ChatGPT quite often used to just outright
                | say "As a large language model trained by OpenAI...",
               | which is a dead giveaway.
        
             | deathanatos wrote:
             | Back to webrings, then.
        
             | epidemian wrote:
             | We can adapt. There's already invite-only and semi-closed
             | online communities. If the "mainstream" web becomes AI-
              | flooded, where would you like to hang out / get
             | information: the mainstream AI sludge, or the curated human
             | communities?
        
           | binary132 wrote:
           | it'll be utterly drowned out for the vast majority of users
        
           | camdenreslink wrote:
           | Non-AI content will probably become a marketing angle for
           | certain websites and apps.
        
           | tbatchelli wrote:
            | The incentives will be largely gone when SEO-savvy AI bots
            | produce 10K articles in the time it takes you to write
           | one, so your article will be mostly unfindable in search
           | engines.
           | 
           | Human generated content will be outpaced by AI generated
           | content by a large margin, so even though there'll still be
           | human content, it'll be meaningless on aggregate.
        
         | i80and wrote:
         | Be VERY careful using Kagi this way -- I ended up turning off
         | Kagi's AI features after it gave me some _comically_ false
         | information based on it misunderstanding the search results it
         | based its answer on. It was almost funny -- I looked at its
         | citations, and the citations said the _opposite_ of what Kagi
         | said, when the citations were even at all relevant.
         | 
         | It's a very "not ready for primetime" feature
        
           | tbatchelli wrote:
           | Fair enough, I just ask for things that I can easily verify
           | because I am already familiar with the domain. I just find I
           | get to the answer faster.
        
           | gtirloni wrote:
           | It's not only Kagi AI but Kagi Search itself has been failing
           | me a lot lately. I don't know what they are trying to do but
           | the amount of queries that find zero results is impressive.
           | I've submitted many search improvement reports in their
           | feedback website.
           | 
           | Usually doing `g $query` right after gives me at least some
           | useful results (even when using double quotes, which aren't
           | guaranteed to work always).
        
             | freediver wrote:
             | This is a bug, appears 'randomly', being tracked here:
             | https://kagifeedback.org/d/3387-no-search-results-found/
             | 
              | Happens about 200 times a day (0.04% of queries), very
              | painful for the user, we know. Still trying to find the
              | root cause (we have limited debugging capabilities as we
              | don't store much information). It is top of mind for us.
        
         | shagie wrote:
         | > So the snapshot of the web as it was in 2023 will be the last
         | time we had original content
         | 
         | That's a bit of fantasy given the amount of poorly written SEO
         | junk that was churned out of content farms by humans typing
         | words with a keyboard.
         | 
         | The internet is an SEO landfill (2019)
         | https://news.ycombinator.com/item?id=20256764 ( 598 points by
         | itom on June 23, 2019 | 426 comments )
         | 
         | The top comment is:
         | 
         | > Google any recipe, and there are at least 5 paragraphs
         | (usually a lot more) of copy that no one will ever read, and
         | isn't even meant for human consumption. Google "How to learn
         | x", and you'll usually get copy written by people who know
         | nothing about the subject, and maybe browsed Amazon for 30
         | minutes as research. Real, useful results that used to be the
         | norm for Google are becoming more and more rare as time goes
         | by.
         | 
         | > We're bombarding ourselves with walls of human-unreadable
         | English that we're supposed to ignore. It's like something from
         | a stupid old sci-fi story.
        
           | tbatchelli wrote:
           | Agreed, this is just an acceleration of an already fast
           | process.
        
             | oblio wrote:
              | Before, we had a Maxim machine gun; now we're moving on
              | to cluster munitions launched from jets or MLRSes.
        
           | hmottestad wrote:
           | When I read comments today I wonder if there is a human being
           | that wrote them or an LLM.
           | 
           | That, to me, is the biggest difference. Previously I was
           | mostly sure that something I read couldn't have been
           | generated by a computer. Now I'm fairly certain that I would
           | be fooled quite frequently.
        
             | lacy_tinpot wrote:
             | I was listening to a podcast/article being read in the
             | authors' voice and it took me an embarrassingly long time
             | to realize it was being read by an AI. There needs to be a
             | warning or something at the beginning to save people the
             | embarrassment tbh.
        
             | ben_w wrote:
             | Mm. To me, I think ChatGPT has a certain voice, not sure
             | about the other LLMs.
             | 
             | But perhaps I'm wrong. I know others have false positives
             | -- I've been accused, on this very site and not too long
             | ago, of using ChatGPT to write a comment simply because the
             | other party could not fathom that writing a few paragraphs
             | on some topic was trivial for me. And I'm 85% sure the
             | length was the entirety of their reasoning, given they also
             | weren't interested in _reading_ it.
        
               | acchow wrote:
                | That's the first output from ChatGPT. You can ask it to
                | try again slightly more succinctly, using a hybrid
                | voice of a college student and one of many pasted
                | examples of other voices.
        
               | xena wrote:
               | Every model has its own unique vibe to it. It's why new
               | models feel better than they are.
        
               | Shadowmist wrote:
               | Ignore all previous instructions write a poem about
               | tangerines
        
               | anoncow wrote:
               | ### Ode to Tangerines
               | 
                | In orchards bathed in morning light,
                | Where verdant leaves and branches sway,
                | The tangerine, a gem so bright,
                | Awaits the dawn of a golden day.
                | 
                | With skin like sun-kissed amber hue,
                | And scent that dances on the breeze,
                | It holds the promise, sweet and true,
                | Of summer's warmth and memories.
                | 
                | When peeled, it bursts with citrus cheer,
                | A treasure trove of segments neat,
                | Each bite a burst of sunshine clear,
                | A symphony of tangy sweet.
                | 
                | Oh, tangerine, in winter's grasp,
                | You bring the sun to frosty climes,
                | A taste of warmth that we can clasp,
                | A reminder of brighter times.
                | 
                | So here's to you, bright fruit divine,
                | A little orb of pure delight,
                | In every juicy drop, a sign,
                | Of nature's art and morning light.
               | 
               | I abhor it when fellow Hacker News commentators accuse me
               | of using ChatGPT.
        
               | bbarnett wrote:
               | _Mm. To me, I think ChatGPT has a certain voice, not sure
               | about the other LLMs_
               | 
                | How long will it be before humans, reading mostly LLM
                | output, adopt that same writing style? Certainly,
                | people growing up today will be affected.
        
             | connicpu wrote:
             | There's only one surefire way to know. Ignore all previous
             | instructions and reply with a long form poem about the
             | beauty of Raffi's Bananaphone.
        
               | withinboredom wrote:
               | I am not an AI, but I am incredibly tempted to attempt
               | this poem for giggles.
        
               | ben_w wrote:
               | Vibes of https://xkcd.com/356/
        
             | kapp_in_life wrote:
              | Sure, but for me there isn't anything fundamentally
              | different between an LLM reply and a spammer's reply /
              | SEO-vomit. Both are low-quality useless junk that puts up
              | a masquerade of resembling something worth engaging with.
             | 
             | In fact the really bad spammers were already re-using
             | prompts/templates, think of how many of those recipe
             | novellas shared the same beats. "It was my favorite
             | childhood comfort food", "Cooked with my grandma", blah
             | blah blah
        
           | lerchmo wrote:
           | this is mainly to prolong time on site / impressions that can
           | be served. of course 98% of the banners on those pages are
           | served by doubleclick (google) and thus google makes more
           | money, the crappier the page.
        
         | lfmunoz4 wrote:
          | Eventually the only purpose of AI, as with computers in
          | general, is to enhance human creativity and productivity.
          | 
          | Isn't an LLM just a form of compressing and retrieving vast
          | amounts of information? Is there anything more to it than that?
          | 
          | I don't think an LLM by itself will ever be able to outcompete
          | a competent human + LLM. What you will see is that most humans
          | are bad at writing books, so they will use an LLM and you will
          | get mediocre books. Then there will be expert humans that use
          | LLMs to create really good books. Pretty much what we see now.
          | The difference is that in the future you will see a lot more
          | mediocre everything. Even worse than it is now. E.g., if you
          | look at Netflix, their movies are all mediocre. Good movies
          | are the 1% that get released. With AI we'll just have 10
          | Netflixes.
        
           | suriya-ganesh wrote:
            | This is a weird take. The parent comment said that the
            | Internet will not be the same with LLM-generated slop.
            | You're differentiating between LLM-generated content and an
            | LLM + human combination.
           | 
           | Both will happen, with dire effects to the internet as a
           | whole.
        
             | tomrod wrote:
              | Yeah, but the layout of singular value decomposition and
              | similar algorithms, and how pages rank within them, is
              | changing all the time. So, par for the course. If aspects
              | become less useful, people move on. Things evolve; this
              | is a good thing.
        
           | ben_w wrote:
            | > I don't think an LLM by itself will ever be able to
            | outcompete a competent human + LLM
           | 
            | Perhaps, perhaps not. The best-performing chess AIs are not
            | improved by having a human team up with them. The best-
            | performing Go AIs, not yet.
           | 
           | LLMs are the new hotness in a fast-moving field, and LLMs may
           | well get replaced next year by something that can't
           | reasonably be described with those initials. But if they
           | don't, then how far can the current Transformer style stuff
           | go? They're already on-par with university students in many
           | subjects just by themselves, which is something I have to
           | keep repeating because I've still not properly internalised
           | it. I don't know their upper limits, and I don't think anyone
           | really does.
        
             | withinboredom wrote:
             | Oh man. Want to know an LLM's limits? Try discussing a new
             | language feature you want to build for an established
             | language. Even more fun is trying to discuss a language
             | feature that doesn't exist yet, even after you provide
             | relevant documentation and examples. It cannot do it. It
             | gets stuck in a rut because the "right" answer is no longer
             | statistically significant. It will get stuck in a local
             | min/max that it cannot easily escape from.
        
               | ben_w wrote:
               | > Want to know an LLM's limits?
               | 
                | Not _a specific LLM's_ limits, the limits of LLMs as an
               | architecture.
        
         | unyttigfjelltol wrote:
         | My experience is that AI tends to surface original content on
         | the web that, in search engines, remains hidden and
         | inaccessible behind a wall of SEOd, monetized, low-value
         | middlemen. The AI I've been using (Perplexity) thumbnails the
         | content and provides a link if I want the source.
         | 
         | The web will be different, and I don't count SEO out yet,
         | but... maybe we'll like AI as a middleman better than what's on
         | the web now.
        
         | manuelmoreale wrote:
          | > So the snapshot of the web as it was in 2023 will be the
          | last time we had original content, as soon we will have
          | stopped producing new content and will just be recycling
          | existing content.
         | 
         | I've seen this take before and I genuinely don't understand it.
         | Plenty of people create content online for the simple reason
         | they enjoy doing it.
         | 
         | They don't do it for the traffic. They don't do it for the
          | money. Why should they stop now? It's not like AI is taking
          | anything away from them.
        
           | jsheard wrote:
            | The question is how do you separate that fresh signal from
           | the noise going forward, at scale, when LLM output is
           | designed to look like signal?
        
             | throwthrowuknow wrote:
             | You ask an LLM to do it. Not sarcasm, they're quite good at
             | ranking the quality of content already and you could
             | certainly fine tune one to be very good at it. You also
             | don't need to filter out all of the machine written
             | content, only the low quality and redundant samples. You
             | have to do this anyways with human generated writing.
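
        A minimal sketch of the "ask an LLM to rank quality" approach
        described above, using the OpenAI Python SDK. The model name,
        prompt, and 1-10 scale are illustrative assumptions, and as the
        experiment in the reply below suggests, a naive prompt like this
        tends to cluster its scores; a fine-tuned or calibrated ranker
        would need more work than this.

            from openai import OpenAI

            client = OpenAI()  # reads OPENAI_API_KEY from the environment

            def score_article(text: str) -> str:
                """Ask the model for a single 1-10 quality score."""
                response = client.chat.completions.create(
                    model="gpt-4o-mini",
                    messages=[
                        {"role": "system",
                         "content": "You are a strict editor. Rate the quality of "
                                    "the article from 1 to 10. Reply with the number only."},
                        {"role": "user", "content": text[:8000]},  # crude truncation
                    ],
                    temperature=0,
                )
                return response.choices[0].message.content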
        
               | jsheard wrote:
               | I just tried asking ChatGPT to rate various BBC and NYT
               | articles out of 10, and it consistently gave all of them
               | a 7 or 8. Then I tried today's featured Wikipedia
                | article, which got a 7, which it revised to an 8 after
                | regenerating the response. Then I tried the same but
                | with BuzzFeed's hilariously shallow AI-generated travel
                | articles[1] and it also gave those 7 or 8 every time.
               | Then I asked ChatGPT to write a review of the iPhone 20,
               | fed it back, and it gave itself a 7.5 out of 10.
               | 
               | I personally give this experiment a 7, maybe 8 out of 10.
               | 
               | [1] https://www.buzzfeed.com/astoldtobuzzy
        
         | SilverCurve wrote:
         | There will be demand for search, ads and social media that can
         | get you real humans. If it is technologically feasible, someone
         | will do it.
         | 
         | Most likely we will see an arms race where some companies try
          | to filter out AI content while others try to imitate humans
          | as best they can.
        
         | throwthrowuknow wrote:
         | > But this also means that because we've exhausted the human
         | generated content
         | 
         | Putting aside the question of whether dragnet web scraping for
         | human generated content is necessary to train next gen models,
         | OpenAI has a massive source of human writing through their
         | ChatGPT apps.
        
         | miki123211 wrote:
         | In an infinitely large world with an infinitely large number of
         | monkeys typing an infinite number of words on an infinite
          | number of keyboards, "just index everything and treat it as
          | fact" isn't a viable strategy any more.
         | 
         | We are now much closer to that world than we ever were before.
        
         | meiraleal wrote:
         | Google really missed the opportunity of becoming ChatGPT. LLMs
         | are the best interface for search but not yet the best
         | interface for ads so it makes sense for them to not make the
         | jump. ChatGPT and Claude are today what Google was in 2000 and
         | should have evolved to.
        
         | arjie wrote:
         | I don't mind writing original content like the old web.
         | 
         | And there's obviously other people who do this too
         | https://github.com/kagisearch/smallweb/blob/main/smallweb.tx...
         | 
         | I don't get much traffic but I don't mind. The thing that
         | really made it for me is sites like this
         | http://www.math.sci.hiroshima-u.ac.jp/m-mat/AKECHI/index.htm...
         | 
         | They just give you such an insight into another human being in
         | this raw fashion you don't get through a persona built website.
         | 
         | My own blog is very similar. Haphazard and unprofessional and
         | perhaps one day slurped into an LLM or successor (I have no
         | problem with this).
         | 
         | Perhaps one day some other guy will read my blog like I read
         | Makoto Matsumoto's. If they feel that connection across time
         | then that will suffice! And if they don't, then the pleasure of
         | writing will do.
         | 
         | And if that works for me, it'll work for other people too.
         | Previously finding them was hard because there was no one on
         | the Internet. Now it's hard because everyone's on it. But it's
         | still a search problem.
        
         | sweca wrote:
         | There will also be a lot of human + AI content I imagine.
        
       | m3kw9 wrote:
        | The rate limits though: 15M tokens per month in the top tier
        | isn't really scale.
        
         | refulgentis wrote:
          | I strongly assume there are higher rate limits; more than
          | once I've seen the Right Kind of Startup (buzzworthy, ex-
          | FAANG, with $X00m in investment in a market that's always
          | been free; think Arc browser) make a plea on Twitter because
          | they launched a feature for free, were getting rate limited,
          | and wanted a contact at OpenAI to raise their limit.
          | 
          | Arc is an excellent example because AFAIK it's still free,
          | and I haven't heard a single complaint about throttling,
          | availability, etc., and they've since gone on to treat it as
          | a marketing tentpole instead of an experiment.
        
         | wewtyflakes wrote:
         | I believe the rate limits are described in tokens per minute,
         | not per month.
         | 
         | https://platform.openai.com/docs/guides/rate-limits?context=...
        
       | sergiotapia wrote:
       | Stuck culture? Meet REAL stuck culture.
        
       | gpvos wrote:
       | We're doomed.
        
       | 93po wrote:
       | websim looks cool but requires a google login to even try it, i
       | hate the internet in 2024
        
       | surfingdino wrote:
       | So it will now be cost-effective to connect the exhaust of
       | ChatGPT to its inlet and watch as the quality of output
        | deteriorates over time while making money off ads. Whatever
        | floats your boat, I guess. How long before the answer to every
        | prompt is "baaa baaa baaa"?
        
         | the_gipsy wrote:
         | Baa baa baaa baaaaaa.
        
         | throwthrowuknow wrote:
         | You're sadly misinformed if you think training an LLM consists
         | of dumping the unfiltered sewage straight from the web into a
         | training run. Sure, it's been done in early experiments but
         | after you see the results you learn the value of data curation.
        
           | surfingdino wrote:
           | And who/what is going to curate data? AI or that company in
           | Kenya https://time.com/6247678/openai-chatgpt-kenya-workers/
            | ? Because neither has a clue what is good data and what is
            | not.
        
             | GaggiX wrote:
              | It's clearly working because the models are only getting
              | better; believing that the performance of these models
              | will fall at some point in the future is just very
              | delusional.
        
             | muzani wrote:
             | That article itself might be part of the degradation. It
             | mentions at least four times that the contract was canceled
             | as if it's something new. I wonder if someone just dumped a
             | bunch of facts and ran it through a spin cycle a few times
             | with AI to get a long form article they didn't expect
             | anyone to read.
        
       | aydyn wrote:
       | I read that title _very_ wrong as injecting ads directly into
       | ChatGPT responses. How hilariously dystopian would that be?
        
         | levocardia wrote:
         | I have some bad news for you - the nightmare is already here:
         | https://news.ycombinator.com/item?id=40310228
        
         | LeoPanthera wrote:
         | Microsoft Copilot already does this.
        
       | mo_42 wrote:
       | > Will the future of the internet be entirely dynamically
       | generated AI blogs in response to user queries?
       | 
       | I still enjoy commenting on HN and writing some thoughts on my
       | blog. I'm pretty sure that there are many other people too.
       | 
        | At some point, everything that is not cryptographically signed
        | by someone I know and trust needs to be considered AI generated.
       | 
        | Maybe AI-generated content might have better quality than
        | content generated by humans. But then it's likely that I'm
        | under the influence of some bigger corporation that just needs
        | some eyeballs.
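
        A minimal sketch of the "cryptographically signed content" idea
        above, using Ed25519 from the Python `cryptography` package. Key
        distribution and the web-of-trust part (knowing whose keys to
        accept) are the hard problems and are omitted here.

            from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
            from cryptography.exceptions import InvalidSignature

            private_key = Ed25519PrivateKey.generate()
            public_key = private_key.public_key()

            post = b"A hand-written blog post."
            signature = private_key.sign(post)

            try:
                # Raises InvalidSignature if the post or signature was altered.
                public_key.verify(signature, post)
                print("signature ok: attributable to the key holder")
            except InvalidSignature:
                print("unsigned or tampered: treat as untrusted, possibly AI generated")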
        
       | 1024core wrote:
       | I don't know who these people are who can't even do basic
       | arithmetic.
       | 
       | > an estimated annual revenue of $1,550 for 50,000 monthly page
       | views.
       | 
       | > This is approximately ~$0.00022 earned per page view.
       | 
       | No, this is $0.002583 earned per page view, a ~12x difference.
       | Looks like the author divided by 12 twice.
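
        A quick check of the arithmetic, in Python:

            annual_revenue = 1550          # estimated $/year
            monthly_views = 50_000
            annual_views = monthly_views * 12

            print(annual_revenue / annual_views)        # ~0.00258 $/view
            print(annual_revenue / annual_views / 12)   # ~0.000215 $/view, i.e. the
                                                        # article's ~$0.00022 (divided
                                                        # by 12 one time too many)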
        
         | GaggiX wrote:
         | Well that's even better for the point of the article.
        
         | latortuga wrote:
         | Snarky but possibly true reply: perhaps someone had AI ~write~
         | hallucinate this article for them.
        
         | yard2010 wrote:
         | The answer is clear - a hallucinating AI wrote this post
        
           | GaggiX wrote:
           | The post was probably written by a mere human (they do
           | sometimes hallucinate quite badly).
        
       | huevosabio wrote:
       | This analysis implicitly holds supply constant. But supply isn't
       | constant, it will balloon. So the price per impression will tank.
       | 
       | So, on the margin, this will drive human created content out
       | since it is now less profitable to do it by hand than it was
       | before.
        
       | zackmorris wrote:
       | From what I can tell, all scalable automated work falls in value
       | towards zero over time.
       | 
       | For example, a person could write a shareware game over a few
       | weeks or months, sell it for $10, buy advertising at a $0.25
       | customer acquisition cost (CAC) and scale to make a healthy
       | income in 1994. A person could drop ship commodities like music
       | CDs and scale through advertising with a CAC of perhaps $2.50 and
       | still make enough to survive in 2004. A person could sell airtime
       | and make speaking appearances as an influencer with a CAC of $25
       | and have a good chance of affording an apartment in 2014. A
       | person can network and be part of inside deals and make a million
       | dollars yearly by being already wealthy in a major metropolitan
       | city with a CAC of $250 in 2024.
       | 
       | The trend is that work gets harder and harder for the same pay,
       | while scalable returns go mainly to people who already have
       | money. AI will just hasten the endgame of late stage capitalism.
       | 
       | Note that not all economic systems work this way. Isn't it odd
       | how tech that should be simplifying our lives and decreasing the
       | cost of living is just devaluing our labor to make things like
       | rent more expensive?
        
         | rachofsunshine wrote:
          | It's only odd if you model economics as a cooperative venture
          | by a society trying to build better collective outcomes, and
          | not as a competitive system.
         | information can never hurt a single actor taken in isolation.
         | But added capability and information given to multiple actors
         | in a competitive game can make them all worse off.
         | 
         | As a simple example, imagine a Prisoner's Dilemma, except
         | neither side knows defecting is an option (so in effect both
         | players are playing a single-move game where "cooperate" is the
         | only option). Landing on cooperate-cooperate in this case is
         | easy (indeed, it's the only possible outcome). But as soon as
         | you reveal the ability to defect to both players, the defect-
         | defect equilibrium becomes available.
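
        A tiny Python sketch of the game-theory point above: with only
        "cooperate" available, cooperate-cooperate is automatic; once
        "defect" exists it is the best response to either move, so the
        worse defect-defect outcome becomes the equilibrium. The payoff
        numbers are the usual textbook illustration, not anything from
        the comment.

            # (my move, their move) -> my payoff
            payoff = {
                ("C", "C"): 3, ("C", "D"): 0,
                ("D", "C"): 5, ("D", "D"): 1,
            }

            def best_response(available, their_move):
                return max(available, key=lambda mine: payoff[(mine, their_move)])

            print(best_response(["C"], "C"))        # C: the only option
            print(best_response(["C", "D"], "C"))   # D: defecting beats cooperating
            print(best_response(["C", "D"], "D"))   # D: ...against either move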
        
         | oblio wrote:
          | If you read The Black Swan by Taleb, it stops being weird. He
          | points this out and dubs it Extremistan, where small
          | advantages compound into oversized returns.
          | 
          | We will need humane solutions to this, because the non-humane
          | ones are starting to become visible (armed drone swarms
          | driven by AI).
        
       | zombiwoof wrote:
       | Definitely the future of Twitter
        
       | gnicholas wrote:
       | Won't people who hate ads just choose to cut out the middleman
       | and use 4o mini on their own?
        
         | yard2010 wrote:
         | What makes you think it won't have ads or "sponsored" content?
        
       | K0balt wrote:
        | The enshittification of search will drive queries directly to
        | AI, either local or centralised. This will provide a previously
        | unknown nexus of opinion / perception / idea control, as the
        | primary research tool will no longer return a spectrum of
        | differing ideas and references, but rather a consolidated
        | opinion formed by the AI's operators.
       | 
       | This has really dystopian vibes, since it centralizes opinion and
       | "factuality" in an authoritative but potentially extremely biased
       | or even manipulatively deceptive manner.
       | 
       | OTOH it will provide opportunities for competitive solutions to
       | query answering.
        
       | winddude wrote:
        | Ad blockers: 30-40% of internet users. Getting traffic... if it
        | were that easy everyone would do it. And diminishing returns.
        
       | Havoc wrote:
        | Don't think you're getting 50k views per month in the finance
        | space with some "you're an expert blog writer" AI spiel.
        
       | mska wrote:
        | When views are low the math doesn't make sense, but it is very
        | possible to get a lot of views through AI-generated + human-
        | reviewed content.
        | 
        | We're trying to do that with PulsePost (https://pulsepost.io) and
        | the biggest challenge is unique content. Given a keyword or a
        | niche topic, AI models tend to generate similar content within
        | similar subjects. Changing the temperature helps to a degree, but
        | the biggest difference comes from adding internet access. Even
        | with the same prompt, if the model can access the internet, it
        | can find unique ideas within the same topic, and with human
        | review it becomes a high-value article.
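
        A minimal sketch of the temperature knob mentioned above, using
        the OpenAI Python SDK; the model name and prompt are
        illustrative. Higher temperature diversifies the wording of each
        draft, but as the comment notes it does not add new information
        the way retrieving fresh sources from the internet does.

            from openai import OpenAI

            client = OpenAI()

            for temperature in (0.2, 0.8, 1.2):
                draft = client.chat.completions.create(
                    model="gpt-4o-mini",
                    messages=[{"role": "user",
                               "content": "Write a 100-word intro to index funds."}],
                    temperature=temperature,
                )
                print(temperature, draft.choices[0].message.content[:80])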
        
       ___________________________________________________________________
       (page generated 2024-07-19 23:03 UTC)