[HN Gopher] OpenAI shuts down its AI Classifier due to poor accu...
       ___________________________________________________________________
        
       OpenAI shuts down its AI Classifier due to poor accuracy
        
       Author : cbowal
       Score  : 351 points
       Date   : 2023-07-25 14:34 UTC (8 hours ago)
        
 (HTM) web link (decrypt.co)
 (TXT) w3m dump (decrypt.co)
        
       | [deleted]
        
       | spandrew wrote:
       | The only way to prevent AI from answering questions in digital
       | platforms is to develop a ML db on the typing style of every
       | student across their tenure at an institution. Good luck getting
       | that approved -- departments can't even access grade or demo data
       | without a steering group going through a 3-deep committee
       | process.
       | 
        | ¯\_(ツ)_/¯ try paper I guess. Time to brush up on our OCR.
        
         | humanistbot wrote:
         | If AI can replicate linguistic patterns in a way that is
         | undetectable for both humans and models, then it seems even
          | easier for an ML model to emulate a natural typing style,
         | rhythm, and cadence in a way that is undetectable for both
         | humans and models.
         | 
         | But you know who has more real-world data on typing style?
         | Google, Microsoft, Meta, and everyone else who runs SaaS docs,
         | emails, or messaging. I imagine a lot of students write their
         | essays on Google Docs, Word, or the like, and submit them as
         | attachments or copy-paste into a textbox.
        
       | yk wrote:
        | Funny incentive problem: OpenAI obviously has an incentive to use
        | its best AI detection tool for adversarial training, with the
        | result that its detection tool will not be very good against
        | ChatGPT-generated text, because the generator is trained to
        | defeat the detection tool.
        
       | JohnMakin wrote:
        | We shattered the Turing test and now we want to put it back into
        | Pandora's box because we don't like the repercussions.
        
         | pixl97 wrote:
          | We asked the question 'can we beat the Turing test?', not what
          | would happen when we did.
        
       | chakintosh wrote:
       | This tool's been fueling tons of false accusations in academia.
       | Wife is doing her PhD and she often tells me stories about
       | professors falsely accusing students of using ChatGPT.
        
         | practice9 wrote:
         | Lots of stories on Reddit about school teachers unfairly
         | accusing students of using ChatGPT for assignments too
        
         | Qqqwxs wrote:
         | A group of students at my university were claiming their papers
          | were being marked by an LLM. They cited a classifier like the OP
         | which they used on their feedback comments.
        
         | paxys wrote:
         | It isn't just this one. There are a hundred different "AI
         | detectors" sold online that are all basically snake oil, but
         | overzealous professors and school administrators will keep
         | paying for them regardless.
        
         | VeninVidiaVicii wrote:
         | Eh I am doing my PhD and I use ChatGPT all the time!
        
           | gigglesupstairs wrote:
           | And?
           | 
           | The tool in question was used for AI text detection not
           | generation.
        
             | KeplerBoy wrote:
              | Not everyone who is accused might be accused wrongly. Then
              | again, does it matter if you use ChatGPT for inspiration?
        
       | PeterStuer wrote:
        | It's an open-ended Red Queen problem. You can't win.
        | 
        | Besides, even if they did win, they would still lose by shooting
        | themselves in the foot.
        
       | TheCaptain4815 wrote:
        | I'm in the SEO game and I've spoken with some 'heavy players' who
        | believe a "Google AI Update" is in the works. As it currently
        | stands, the search engine results will be completely overtaken by
        | AI content in the near future without it.
        | 
        | From my understanding, this is a fool's errand in the long run,
        | but there are current AI classifier detectors that can
        | successfully detect ChatGPT and other models (Originality.ai
        | being a big one) on longish content.
        | 
        | Their process is fairly simple: they create a classification
        | model after generating tons of examples from all the major models
        | (ChatGPT, GPT-4, LLaMA, etc.).
        | 
        | One obvious downside to their strategy is the use of fine-tuning
        | and how that changes the stylistic output. This same 'heavy
        | hitter' has successfully bypassed Originality's detector using
        | his specific fine-tuning method (which he said took months of
        | testing and thousands of dollars).
        
         | bearjaws wrote:
          | Google needs to do a full 180: only the most succinct websites
          | that answer search queries should be elevated.
          | 
          | The current state of Google is a disaster. Everything is 100
          | paragraphs per article, with the answer you are looking for
          | buried halfway in, to make sure you spend more time and scroll
          | to appease the algorithm.
          | 
          | I cannot wait for them to sink all these spam websites.
        
           | pixl97 wrote:
            | Don't wait for Google to do that; they'd lose too many ad
            | links.
        
       | vorticalbox wrote:
        | Could we not add invisible characters into the text, a bit like a
        | watermark?
        
         | nomel wrote:
         | Yes, and we could just as easily remove them.
        
           | vorticalbox wrote:
           | True but I feel it would catch a few people out?
        
       | squarefoot wrote:
       | How could we have both the AI that is indistinguishable from
       | humans _and_ the AI that can detect it with good accuracy? That
        | would imply a race on both sides for an AI that is more
        | intelligent than a high-IQ human.
        
         | jerf wrote:
         | Text is a very high dimensional space, n-dimensional in fact.
         | There is plenty of room for an AI to leave a fingerprint that
         | can be detected in some ways but not others.
         | 
         | In fact it doesn't take much text to distinguish between two
         | human beings. The humanly-obvious version is that someone that
         | habitually speaks in one dialect and someone else in another
         | must be separate, but even without such obvious tells humans
          | separate themselves into characterizable subsets of this space
         | fairly quickly.
         | 
         | I'm skeptical about generalized AI versus human detection in
         | the face of the fact that it is adversarial. But a constant,
         | unmoving target of some specific AI in some particular mode
         | would definitely be detectable; e.g., "ChatGPT's current
         | default voice" would certainly be detectable, "ChatGPT when
          | instructed to sound like Ernest Hemingway" would be
         | detectable, etc. I just question whether ChatGPT _in general_
         | can be characterized.
        
       | seeknotfind wrote:
       | Watermarking is a more tractable approach, but the cat is out of
       | the bag.
        
         | cwkoss wrote:
         | only tractable for closed source hosted LLMs
        
         | sydon wrote:
         | How would you go about watermarking AI written text?
        
           | cateye wrote:
            | One way is trying to sneak in a specific structure/pattern
            | that is difficult for a human to notice when reading, like
            | using a particular sentence length, paragraph length, or
            | punctuation pattern, or using certain words that humans
            | rarely use.
           | 
           | Watermarking needs to be subtle enough to be unnoticeable to
           | opposing parties, yet distinctive enough to be detectable.
           | 
           | So, this is an arms race especially because detecting it and
           | altering it based on the watermark is also fun :)
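            | 
            | A purely illustrative toy of the "structural pattern" idea
            | (no real system works this way, and it is easy to spot and
            | strip):
            | 
            |   import re
            | 
            |   def read_bits(text):
            |       # Toy "structural watermark": one bit per
            |       # sentence, encoded as even/odd word count.
            |       sentences = [s for s in re.split(r"[.!?]+", text)
            |                    if s.strip()]
            |       return [len(s.split()) % 2 for s in sentences]
            | 
            |   print(read_bits("One two. One two three. One two."))
            |   # -> [0, 1, 0]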
        
             | nonethewiser wrote:
             | > One way is trying to sneak in a specific
             | structure/pattern that is difficult for a human to notice
             | when reading
             | 
             | This seems like a total non-starter. That can only
             | negatively impact the answers. A solution needs to be
             | totally decoupled from answer quality.
        
               | thewataccount wrote:
                | The paper I linked in the parent comment describes that
                | as the "simple proof of concept" on page 2, and, like you
                | said, it outlines its limitations: it both hurts
                | performance and is easily detectable.
               | 
               | Their improved method instead only replaces tokens when
               | there's many good choices available, and skips replacing
               | tokens when there are few good choices. "The quick brown
               | fox jumps over the lazy dog" - "The quick brown" is not
               | replaceable because it would severely harm the quality.
               | 
               | Essentially it's only replacing tokens where it won't
               | harm the performance.
               | 
               | It's worth noting that any watermarking will likely harm
               | the quality to some degree - but it can be minimized to
               | the point of being viable.
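                | 
                | Very roughly, a toy version of that gating idea (my own
                | paraphrase, not the paper's code, which works on the
                | model's logits rather than on ready-made
                | probabilities):
                | 
                |   import hashlib, math, random
                | 
                |   def green(token, key="demo-key"):
                |       # Hash each token into a fixed "green"
                |       # half of the vocabulary.
                |       h = hashlib.sha256((key + token).encode())
                |       return h.digest()[0] % 2 == 0
                | 
                |   def pick(cands, boost=2.0):
                |       # cands: {token: probability}, all p > 0
                |       ent = -sum(p * math.log(p)
                |                  for p in cands.values() if p > 0)
                |       if ent < 1.0:
                |           # Few good choices: leave them alone.
                |           w = dict(cands)
                |       else:
                |           # Many good choices: nudge green ones.
                |           w = {t: p * (boost if green(t) else 1)
                |                for t, p in cands.items()}
                |       toks = list(w)
                |       return random.choices(
                |           toks, [w[t] for t in toks])[0]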
        
               | yttribium wrote:
                | You can do this by injecting non-visible Unicode (LTR /
               | RTL markers, zero width separators, the various "space"
               | analogs, homographs of "normal" characters) but it can
               | obviously be stripped out.
        
           | brucethemoose2 wrote:
           | Make half of the tokens (the AI's "dictionary") slightly more
           | likely.
           | 
           | This would not impact output quality much, but it would only
           | work for longish outputs. And the token probability "key"
            | could probably be reverse engineered with enough output.
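            | 
            | A toy sketch of the detection side of that scheme, assuming
            | a shared secret that defines which half of the vocabulary
            | was boosted:
            | 
            |   import hashlib, math
            | 
            |   def green(token, key="shared-secret"):
            |       h = hashlib.sha256((key + token).encode())
            |       return h.digest()[0] % 2 == 0
            | 
            |   def z_score(tokens, key="shared-secret"):
            |       # Unbiased text should be ~50% "green";
            |       # a large positive z suggests the boosted
            |       # sampler produced it. Needs long outputs.
            |       n = len(tokens)
            |       g = sum(green(t, key) for t in tokens)
            |       return (g - 0.5 * n) / math.sqrt(0.25 * n)
            | 
            |   print(z_score("some suspiciously long text".split()))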
        
             | pixl97 wrote:
              | It would be pretty easy to figure out against standard word
              | probabilities in average datasets. Even then, the longer
              | this system runs, the more likely it is to pollute its own
              | dataset as people learn to write from GPT itself.
        
           | thewataccount wrote:
           | https://arxiv.org/pdf/2301.10226.pdf
           | 
           | Here's a decent paper on it.
           | 
           | It covers private watermarking (you can't detect it exists
           | without a key), resistance to modifications, etc. Essentially
           | you wouldn't know it was there and you can't make simple
           | modifications to fool it.
           | 
           | OpenAI could already be doing this, and they could be
           | watermarking with your account ID if they wanted to.
           | 
           | The current best countermeasure is likely paraphrasing
           | attacks https://arxiv.org/pdf/2303.11156.pdf
        
           | doctorpangloss wrote:
           | I don't know.
           | 
           | I suppose hosted solutions like ChatGPT could offer an API
           | where you copy some text in, and it searches its history of
           | generated content to see if anything matches.
           | 
           | > bUt aCtuAlLy...
           | 
           | It's not like I don't know the bajillion limitations here.
           | There are many audiences for detection. All of them are XY
           | Problems. And the people asking for this stuff don't
           | participate on Hacker News aka Unpopular Opinions Technology
           | Edition.
           | 
           | There will probably be a lot of "services" that "just" "tell
           | you" if "it" is "written by an AI."
        
           | taneq wrote:
           | type=text/chatgpt :P
        
           | mepian wrote:
            | If it's generated by a SaaS, the service could sign all
            | output with its private key (verifiable via its public key).
        
             | meandmycode wrote:
             | This isn't a watermark though, the idea of a watermark is
             | that it's inherently embedded in the data itself while not
             | drastically changing the data
        
             | csmpltn wrote:
             | Why is this comment being downvoted?
             | 
             | OpenAI can internally keep a "hash" or a "signature" of
             | every output it ever generated.
             | 
             | Given a piece of text, they should then be able to trace
             | back to either a specific session (or a set of sessions)
                | through which this text was generated.
             | 
             | Depending on the hit rate and the hashing methods used,
             | they may be able to indicate the likelihood of a piece of
             | text being generated by AI.
        
               | pixl97 wrote:
               | Why would they want to is my question. A single character
               | change would break it.
               | 
               | Then you have database costs of storing all that data
               | forever.
               | 
                | More to the point, it's only for OpenAI. I don't think it
                | will be long before other GPT-4-level models are around
                | that won't give two shits about catering to the AI
                | identification police.
        
               | csmpltn wrote:
               | > A single character change would break it.
               | 
               | That depends on how they hash the data, right? They can
               | use various types of Perceptual Hashing [1] techniques
               | which wouldn't be susceptible to a single-character
               | change.
               | 
               | [1] https://en.wikipedia.org/wiki/Perceptual_hashing
               | 
               | > Then you have database costs of storing all that data
               | forever.
               | 
               | A database of all textual content generated by people?
               | That sounds like a gold mine, not a liability. But as
               | I've mentioned earlier, they don't need to keep the raw
               | data (a perceptual hash is enough).
               | 
               | > won't give two shits about catering to the AI
               | identification police
               | 
               | I'm sure there will be customers willing to pay for
               | access to these checks, even if they're only limited to
               | OpenAI's product (universities and schools - for
               | plagiarism detection, government agencies, intelligence
               | agencies, police, etc).
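                | 
                | A crude illustration with character shingles rather than
                | a true perceptual hash, just to show why a one-character
                | edit doesn't break this kind of matching:
                | 
                |   def shingles(text, n=5):
                |       # Hash overlapping character n-grams; a
                |       # single edit only disturbs ~n of them.
                |       t = " ".join(text.lower().split())
                |       return {hash(t[i:i + n])
                |               for i in range(len(t) - n + 1)}
                | 
                |   def similarity(a, b):
                |       sa, sb = shingles(a), shingles(b)
                |       return len(sa & sb) / len(sa | sb)
                | 
                |   print(similarity(
                |       "The cat sat on the mat.",
                |       "The cat sat on the mat!"))  # ~0.9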
        
             | hackernewds wrote:
             | So other text should not be tagged as AI generated?
        
               | [deleted]
        
           | merlincorey wrote:
           | Invisible characters in a specific bit-pattern.
           | 
           | Pretty common steganographic technique, really.
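            | 
            | Toy example in Python (trivially defeated by stripping the
            | non-printing characters):
            | 
            |   ZW = {"0": "\u200b", "1": "\u200c"}  # zero-width chars
            | 
            |   def hide(text, bits):
            |       # Append one invisible char per bit after
            |       # each word.
            |       words = text.split(" ")
            |       tagged = [w + ZW[b]
            |                 for w, b in zip(words, bits)]
            |       return " ".join(tagged + words[len(bits):])
            | 
            |   def reveal(text):
            |       rev = {v: k for k, v in ZW.items()}
            |       return "".join(rev[c] for c in text if c in rev)
            | 
            |   marked = hide("the quick brown fox", "1011")
            |   print(reveal(marked))  # 1011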
        
             | philipov wrote:
             | So, all I have to do is copy-paste it into a text editor
             | with remove-all-formatting to circumvent that?
        
             | nonethewiser wrote:
             | Can you elaborate on "invisible?" The only invisible
             | character I can imagine is a space. It seems like any other
              | character either isn't invisible or doesn't exist (i.e.,
              | isn't a character).
             | 
             | Additionally, if I copy-paste text like this are the
             | invisible characters preserved? Are there a bunch of extra
             | spaces somewhere?
        
               | michaelt wrote:
               | When students try to evade plagiarism detectors, they
               | will swap characters like replacing spaces with
                | nonbreaking spaces, replacing letters with lookalikes
                | (Latin I vs. its Cyrillic counterpart, etc.), and
                | inserting things like the invisible 'Combining Grapheme
                | Joiner'.
               | 
               | IMHO it isn't a feasible way of watermarking text though
               | - as someone would promptly come up with a website that
               | undid such substitutions.
        
               | jstarfish wrote:
               | > IMHO it isn't a feasible way of watermarking text
               | though - as someone would promptly come up with a website
               | that undid such substitutions.
               | 
               | It doesn't matter since there's no one-pass solution to
               | counterfeiting.
               | 
               | You have the right of it-- the best you can hope for is
               | adding more complexity to the product, which adds steps
               | to their workflow and increases the chances of the
               | counterfeiter overlooking any particular detail that you
               | know to look for.
        
               | lucasmullens wrote:
                | There are a bunch of different "spaces"; one is the
                | "zero-width space", which isn't visible but still gets
                | copied with the text.
               | 
               | https://en.wikipedia.org/wiki/Zero-width_space
        
               | pixl97 wrote:
               | And the second site students will go to is
               | zerospaceremover.com or whatever will show up to strip
               | the junk.
        
       | jedberg wrote:
       | Latest I heard is that teachers are requiring homework to be
       | turned in in Google Docs so that they can look at the revision
       | history and see if you wrote the whole thing or just dumped a
       | fully formed essay into GDocs and then edited it.
       | 
       | Of course the smart student will easily figure out a way to
       | stream the GPT output into Google Docs, perhaps jumping around to
       | make "edits".
       | 
       | A clever and unethical student is pretty much undetectable no
        | matter what roadblocks you put in their way. This just stops the
       | not clever ones. :)
        
         | matteoraso wrote:
         | >Of course the smart student will easily figure out a way to
         | stream the GPT output into Google Docs, perhaps jumping around
         | to make "edits".
         | 
          | No need to complicate it that much. Just start off writing an
          | essay normally, and then paste in the GPT output. A teacher
          | probably isn't going to check any of the revision history,
          | especially if there are more than 30 students to go through.
        
         | justrealist wrote:
         | Well, there is one way, which is timed, proctored exams.
         | 
         | Which sucks, because take-home projects are evaluating a
         | different skill set, and some people thrive on one vs the
         | other. But it is what it is.
        
         | ribosometronome wrote:
          | Retyping the essay from ChatGPT while actively rewording
          | the occasional sentence seems like it would do it.
        
           | mplewis wrote:
           | It's a bit suspicious to type an essay linearly from start to
           | finish, though.
        
             | antod wrote:
             | A bit like how us old timers had to write our exams in the
             | pen and paper days.
        
             | JohnFen wrote:
             | Is it? That's how I've always written them (and still do to
             | this day). I write the first draft linearly from start to
             | end, then go back and do my revising and editing.
        
           | wmeredith wrote:
           | It seems like that's nearing the sweet spot of fraud
           | prevention, where committing the act of fraud is as much work
           | as doing the real thing.
        
             | Nickersf wrote:
             | The intellectual labor of thinking about an essay, drafting
             | it, editing, and revising it is much higher than
             | strategically re-typing a ChatGPT output. One requires
             | experience, knowledge, understanding and creativity, the
             | other one requires basic functioning of motor skills and
             | senses.
             | 
             | You could program a robot to re-type the ChatGPT output
             | into a different word processor and feed it parameters to
             | make the duration between keystrokes and backspaces
             | fluctuate over time. You could even have it stop, come back
             | later, copy and paste sections and re-organize as it moves
             | through and end up with the final essay from ChatGPT.
        
             | frumper wrote:
             | It sounds a lot easier to retype what you see rather than
             | to create it.
        
         | marcell wrote:
         | This will be hard to break. It's basically an hour long
         | CAPTCHA. You can look at things like key stroke timing, mouse
          | movement, revision patterns, etc. I don't see LLMs breaking
          | this approach to classifying human writing.
        
           | jacquesm wrote:
            | > I don't see LLMs breaking this approach to classifying
            | human writing.
           | 
           | Why not? Record a bunch of humans writing, train model,
           | release. That's orders of magnitude simpler than to come up
           | with the right text to begin with.
        
       | lacker wrote:
       | Smart of OpenAI to shut down a tool that basically doesn't work
       | before the school year starts and students start to get in
       | trouble based on it.
       | 
       | I think this upcoming school year is going to be a wakeup call
       | for many educators. ChatGPT with GPT-4 is already capable of
       | getting mostly A's on Harvard essay assignments - the best
       | analysis I have seen is this one:
       | 
       | https://www.slowboring.com/p/chatgpt-goes-to-harvard
       | 
       | I'm not sure what instructors will do. Detecting AI-written
       | essays seems technologically intractable, without cooperation
       | from the AI providers, who don't seem too eager to prioritize
       | watermarking functionality when there is so much competition. In
       | the short term, it will probably just be fairly easy to cheat and
       | get a good grade in this sort of class.
        
         | woeirua wrote:
         | Nah, everything is just going to be proctored exams on paper in
         | the future. Sucks for the pro take-home project crowd, but they
         | ruined it for themselves.
        
       | 13years wrote:
       | "Half a year later, that tool is dead, killed because it couldn't
       | do what it was designed to do."
       | 
       | This was my conclusion as well testing the image detectors.
       | 
        |  _Current automated detection isn't very reliable. I tried out
        | Optic's AI or Not, which boasts 95% accuracy, on a small sample
        | of my own images. It correctly labeled those with AI content as
        | AI generated, but it also labeled about 50% of my own stock
        | photo composites as AI generated. If generative AI were not a
        | moving target I would be optimistic such tools could advance and
        | become highly reliable. However, that is not the case and I have
        | doubts this will ever be a reliable solution._
       | 
       | from my article on AI art - https://www.mindprison.cc/p/ai-art-
       | challenges-meaning-in-a-w...
        
         | danuker wrote:
         | > but it also labeled about 50% of my own stock photo
         | composites I tried as AI generated
         | 
         | Could it be that a large proportion of the source stock photos
         | were actually AI generated?
        
           | 13years wrote:
            | No, they were older images. However, that is now becoming a
            | problem. Some stock photo sites now have AI images and they
            | are not labeled. I'm able to distinguish most for now because
            | at high resolution the details contain obvious errors.
            | 
            | This is really painful, because for some of my work I need
            | high quality images suitable for print. Now I can't just look
            | at the thumbnail and say "this will work"; I have to examine
            | each image closely, which takes more of my time.
        
       | rootusrootus wrote:
       | Good. And I think watermarking AI output is also a dead end.
       | Better that we simply assume that all content is fake unless
       | proven otherwise. To the extent that we need trustworthy photos,
       | it seems like a better idea to cryptographically sign images at
       | the hardware level when the photo is taken. Voluntarily
       | watermarking AI content is completely pointless.
        
         | sebzim4500 wrote:
         | I can see that working for specialized equipment like police
         | body cameras, but if every camera manufacturer in the world
         | needs to manage keys and install them securely into their
         | sensors then there will be leaked keys within weeks.
        
           | ummonk wrote:
            | Just use a certificate chain. The manufacturer can provision
            | each camera with its own key pair and sign the camera's
            | public key.
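            | 
            | A minimal sketch with Python's "cryptography" package,
            | glossing over certificate formats, revocation, and the hard
            | part (keeping the key inside the sensor):
            | 
            |   from cryptography.hazmat.primitives import serialization
            |   from cryptography.hazmat.primitives.asymmetric import (
            |       ed25519)
            | 
            |   # Manufacturer root key (kept offline/in an HSM).
            |   maker = ed25519.Ed25519PrivateKey.generate()
            | 
            |   # Each camera gets its own key pair; the maker
            |   # signs the camera's public key once at the factory.
            |   cam = ed25519.Ed25519PrivateKey.generate()
            |   cam_pub = cam.public_key().public_bytes(
            |       serialization.Encoding.Raw,
            |       serialization.PublicFormat.Raw)
            |   cam_cert = maker.sign(cam_pub)
            | 
            |   # The camera signs each photo; anyone holding the
            |   # maker's public key can verify the whole chain.
            |   # verify() raises InvalidSignature on failure.
            |   photo = b"...jpeg bytes..."
            |   photo_sig = cam.sign(photo)
            |   maker.public_key().verify(cam_cert, cam_pub)
            |   cam.public_key().verify(photo_sig, photo)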
        
             | explaininjs wrote:
             | And when the sensor bus is hijacked to directly feed
             | "trusted" data into the processor?
        
           | tudorw wrote:
           | Makes fake image, hold it in front of camera, click, verified
           | image...
        
             | Wingy wrote:
              | The signed timestamp and location would give that away, but
              | those would have to be made non-configurable by the user.
        
               | cwkoss wrote:
                | Clocks and GPS sensors can be hacked; there is no
                | fundamental source of truth to fall back on here.
                | 
                | It's as Sisyphean a task as AI detection.
        
         | cwkoss wrote:
          | Cryptography can't save us here; people will figure out how to
          | send AI images to the crypto hardware to get them signed
          | within months. It would just be another layer of false
          | security.
        
           | baby_souffle wrote:
            | > Cryptography can't save us here; people will figure out
            | how to send AI images to the crypto hardware to get them
            | signed within months.
           | 
            | Possibly (who am I kidding, *PROBABLY*!) they will use
            | ChatGPT to help them design the method :)
        
           | malfist wrote:
           | That's not how cryptographic signing works.
           | 
           | Cryptographic signing means "I wrote this" or "I created
           | this". Sure you could sign an AI generated image as yourself.
           | But you could not sign an image as being created by Getty or
           | NYT
        
             | esclerofilo wrote:
             | I believe that's not what they're saying. It's signing
             | hardware, like a camera that signs every picture you take,
             | so not even you can tamper with it without invalidating
             | that signature. Naively, then, a signed picture would be
             | proof that it was a real picture taken of a real thing.
             | What GP is saying is that people would inevitably get the
             | keys from the cameras, and then the whole thing would be
             | pointless.
        
               | [deleted]
        
               | cwkoss wrote:
               | Yep.
               | 
               | A chain of trust is one way to solve this problem. Chains
               | of trust aren't perfect, but they can work.
               | 
               | But if you're going to build a chain of trust that relies
               | on humans to certify they used a non-tampered-with crypto
                | camera, why not just let them use plain ol' cameras?
                | Adding crypto-signing hardware just adds a false sense of
                | security that grifter salespeople will claim is
                | 'impossible to break', and non-technical decision makers
                | won't understand the threat model.
        
       | hospitalJail wrote:
        | Not to say "I can detect ChatGPT", but it sure seems to have a
        | similar way of talking even when I say things like: talk like a
        | "Millennial male who is obsessed with Zelda, their name is Bob
        | Zelenski".
        | 
        | Now the topic isn't about anything millennial or Zelda related,
        | but I'd think that the language model would select sentence and
        | paragraph phrasing differently.
       | 
       | Maybe I need to switch to the API.
        
         | post-it wrote:
         | I've also noticed that ChatGPT tends to respond to short
         | prompts, especially questions, in a predictable format. There
         | are a few characteristics.
         | 
         | First, it tends to print a five-paragraph essay, with an
         | introduction, three main points, and a conclusion.
         | 
         | Second, it signposts really well. Each of the body paragraphs
         | is marked with either a bullet point or a number or something
         | else that says "I'm starting a new point."
         | 
         | Third, it always reads like a WikiHow article. There's never
         | any subtle humour or self-deprecation or ironic understatement.
         | It's very straightforward, like an infographic.
         | 
         | It's definitely easy to recognize a ChatGPT response to a
         | simple prompt if the author hasn't taken any measures to
         | disguise it. The conclusion usually has a generic reminder that
         | your mileage may vary and that you should always be careful.
        
           | SquareWheel wrote:
           | I have to admit I'm struggling to tell if this was done
           | ironically, but your comment is exactly a five paragraph
           | essay with an introduction, three main points, and a
           | conclusion.
           | 
           | If so, nice meta-commentary.
        
             | post-it wrote:
             | Thank you, it was intentional!
        
       | stormed wrote:
        | Interesting. I was under the impression this tool was effective
        | because of some sort of hidden patterns generated in sentences.
        | I guess my assumption was more complex than what it actually was.
        
       | sakopov wrote:
       | Humorously, in my experience, if a response from ChatGPT ever got
       | classified as AI generated by tools like ZeroGPT or similar, all
       | I had to do was adjust the prompt to tell the model not to sound
       | like it was AI generated and that bypassed all detection with a
        | very high success rate. Additionally, I also found that if you
        | prompt it to make the response be in the style of some known
        | writer, for example, it often produced responses rated 100%
        | human written by most AI detection models.
        
         | klabb3 wrote:
         | "Can't you just try to blend in and be a little more cool? The
         | bouncer is gonna notice."
         | 
         |  _Starts talking like Shakespeare_
        
       | mercurialsolo wrote:
        | I wonder why we need to know whether something is AI generated
        | at all. It's a Luddite view of AI. Much like the need to
        | distinguish between handcrafted versus machined products - is
        | there a real utility to knowing this?
        | 
        | For educators looking at evaluating students, essays and the
        | like - we possibly need different ways of evaluating rather than
        | relying on written asynchronous content for communicating
        | concepts and ideas.
        
         | klabb3 wrote:
         | I believe you're exactly right. It would be similar to
         | detecting that math homework used wolfram alpha or even a
         | calculator.
        
         | pixl97 wrote:
         | >is there a real utility to knowing this?
         | 
         | For civics, I would say yes.
         | 
         | Imagine you were talking to an online group about a design
         | project for a local neighborhood. Based on the plurality of
          | voices, it seemed like most people wanted a brown and orange
         | design. But later when you talk to actual people in real life,
         | you could only find a few that actually wanted that.
         | 
         | Virtual beings are a great addition to the bot nets that
         | generate false consensus.
        
       | wouldbecouldbe wrote:
        | If the goal is students, then the best tool would not only
        | detect AI, but also let you submit a student's previous writing
        | and see how likely it is that they wrote a similar text - not so
        | much whether it was LLM generated.
        
       | hackernewds wrote:
        | Good. If it is not reliable, it does more harm than good by
        | providing a false sense of security.
        | 
        | An analogous example: my local pizza delivery (where I worked)
        | would seal the box with a safety sticker, to avoid tampering /
        | dipping by the delivery boys. Now, sometimes they would forget to
        | do this for various logistical reasons. Every one of the non-
        | stickered ones started getting returned, as customers worried a
        | pepperoni had been stolen. They stopped doing it shortly after.
        
         | pixl97 wrote:
         | Eh, I'd consider that a failure of employee training and
         | reverse the situation by giving out a weekly bonus to shifts
         | that did not fail to put the security stickers on.
         | 
         | Kinda like if they forgot to put the security seal on your
         | aspirin, I'm not going to take them all off because someone
         | forgot to run production with all the bottles sealed.
        
           | frumper wrote:
           | The bottle of Aspirin goes through many hands between the
           | manufacturer and you including sitting on an unattended shelf
           | open to the public. The person making the pizza is working
           | for the same company as the person delivering it, or may even
           | be the same person. If you can't trust the pizza co delivery
           | person then you probably shouldn't trust the person making it
           | either.
        
             | pixl97 wrote:
              | You're right; don't eat at that pizza place either.
              | 
              | This is the brown M&M principle in effect.
        
           | mplewis wrote:
           | Frankly, the kind of person who forgets to put the sticker on
           | at the pizza place will forget about the bonus too.
        
         | klabb3 wrote:
         | It's a law of nature that pepperoni thieves cannot take a job
         | at a pizza place. They are forever doomed to be delivery guys.
        
       | kylecordes wrote:
       | There is inherent conflict in having both an AI tool business and
       | an AI tool detection business.
       | 
       | If the first does a good job, the second fails. And vice versa.
       | 
       | (On the other hand, maybe there is a lot of money to be made
       | selling both, to different groups?)
        
         | sebzim4500 wrote:
         | I don't think this follows. If they wanted, they could
          | cryptographically bias the sampling to make the output
         | detectable without decreasing capabilities at all.
         | 
         | Only people using it deceptively would be affected. No idea
         | what portion of ChatGPT's users that is, would be very
         | interested to know.
        
           | zarzavat wrote:
           | There's a much more effective way: store hashes of each
           | output paragraph (with a minimum entropy) that has ever been
           | generated, and allow people to enter a block of text to
           | search the database.
           | 
           | It wouldn't beat determined users but it would at least catch
           | the unaware.
        
             | sebzim4500 wrote:
             | Changing every 10th word defeats that strategy but doesn't
             | defeat a cryptographic bias.
             | 
              | Also, the cost of storing every paragraph hash might
              | eventually add up, even if at the moment it would be
              | negligible compared to the generation cost.
        
               | doliveira wrote:
               | They literally already store the whole conversation...
        
               | explaininjs wrote:
               | One solution is to store a hash of every n-gram for n
               | from 2 to whatever, then report what percent of ngrams of
               | various lengths were hits.
               | 
               | Did someone say Bloom Filter??
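                | 
                | Toy sketch: a hand-rolled Bloom filter over word
                | n-grams (a real system would size and tune it, and
                | use better hashing):
                | 
                |   import hashlib
                | 
                |   BITS, K = 1 << 20, 4
                |   bloom = bytearray(BITS // 8)
                | 
                |   def idx(s):
                |       # K bit positions per string, from
                |       # salted SHA-256 hashes.
                |       for i in range(K):
                |           d = hashlib.sha256(
                |               f"{i}:{s}".encode()).digest()
                |           yield int.from_bytes(d[:4], "big") % BITS
                | 
                |   def add(s):
                |       for i in idx(s):
                |           bloom[i // 8] |= 1 << (i % 8)
                | 
                |   def maybe_seen(s):
                |       # False -> definitely never added.
                |       # True  -> probably added (small FP rate).
                |       return all(bloom[i // 8] >> (i % 8) & 1
                |                  for i in idx(s))
                | 
                |   def ngrams(text, n=3):
                |       w = text.split()
                |       return [" ".join(w[i:i + n])
                |               for i in range(len(w) - n + 1)]
                | 
                |   for g in ngrams("the quick brown fox jumps"):
                |       add(g)
                |   print(maybe_seen("quick brown fox"))  # True
                |   print(maybe_seen("lazy sleepy dog"))  # False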
        
       | al_be_back wrote:
        | Low accuracy is certainly a good reason to drop a project,
        | especially when dealing with small texts (<1000 chars), which is
        | where most social media posts and mini-blogs fall.
        | 
        | Bigger texts, e.g. reports, theses, etc., are probably easier
        | and cheaper for humans to verify, with the help of AI tools
        | (reference checking, searching...)
        
       | kmeisthax wrote:
       | The idea that OpenAI was intentionally watermarking its output to
       | avoid training data back-contamination should be thoroughly
       | discredited now.
        
       | bilater wrote:
       | As a joke I built a simple tool that swaps random words for their
       | synonym and it did the trick in throwing off any distribution
       | matching (came out with gibberish but still lol)
       | https://www.gptminus1.com/
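        | 
        | Something in that spirit (not the actual gptminus1 code) can be
        | done with WordNet; results are similarly gibberish-adjacent:
        | 
        |   import random
        |   # pip install nltk; then run nltk.download("wordnet") once
        |   from nltk.corpus import wordnet
        | 
        |   def garble(text, rate=0.3):
        |       # Swap a fraction of words for a random WordNet
        |       # "synonym"; grammar and meaning will suffer.
        |       out = []
        |       for word in text.split():
        |           syns = wordnet.synsets(word)
        |           if syns and random.random() < rate:
        |               lemma = random.choice(
        |                   random.choice(syns).lemmas())
        |               out.append(lemma.name().replace("_", " "))
        |           else:
        |               out.append(word)
        |       return " ".join(out)
        | 
        |   print(garble("large language models generate fluent text"))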
        
       | jillesvangurp wrote:
       | This kind of thing strikes me in any case as something that's
       | only good for the generation of AI it's been trained against. And
       | with the exponential improvements happening almost on a monthly
       | basis, that becomes obsolete pretty quickly and a bit of a moving
       | target.
       | 
       | Maybe a better term would be Superior Intelligence (SI). I sure
       | as hell would not be able to pass any legal or medical exams
       | without dedicating the next decade or so to getting there. Nor do
        | I have any interest in doing so. But GPT-4 is apparently
       | able to wow its peers. Does that pass the Turing test because
       | it's too smart or too stupid? Most of humanity would fail that
       | test.
        
       | hayd wrote:
       | Was this what Stack Overflow were using to detect automated
       | answers?
        
         | Ukv wrote:
          | None were officially built into the site, so it'll vary from
          | moderator to moderator, but the one that mods had a browser
          | script made for, to help streamline moderation, was the
          | RoBERTa Base OpenAI Detector from 2019, created prior to the
          | existence of GPT-3, GPT-3.5 (ChatGPT free), or GPT-4 (ChatGPT
          | pro). It'll be far worse than the 2023 one this article is
          | about.
        
       | capableweb wrote:
        | I'm glad that they did, although they obviously should have made
        | an announcement about it.
        | 
        | The number of people in the ecosystem who think it's even
        | possible to detect if something is AI written or not when it's
        | just a couple of sentences is staggeringly high. And somehow,
        | people in power seem to put their faith in some of these tools
        | that promise a certain amount of truthfulness which in reality
        | they could never guarantee, and act on whatever these "AI vs
        | human-written" tools tell them.
       | 
       | So hopefully this can serve as another example that it's simply
       | not possible to detect if a bunch of characters were outputted by
       | an LLM or not.
        
         | constantcrying wrote:
          | Even the idea of it is bad: ChatGPT is _supposed to_ write
          | indistinguishably from a human.
         | 
         | The "detector" has extremely little information and the only
         | somewhat reasonable criteria are things like style, where
         | ChatGPT certainly has a particular, but by no means unique
         | writing style. And as it gets better it will (by definition) be
         | better at writing in more varied styles.
        
           | RandomLensman wrote:
           | Why even care if it is written by a machine or not? I am not
           | sure it matters as much as people think.
        
             | JohnFen wrote:
             | There are a number of reasons people may care. For
             | instance, the thing about art that appeals to me is that
             | it's human communication. If it's machine generated, then I
             | want to know so that I can properly contextualize it (and
             | be able to know whether or not I'm supporting a real person
             | by paying for it).
             | 
             | A world where I can't tell if something is made by human or
             | by machine is a world that has been drained of something
             | important to me. It would reduce the appeal of all art for
             | me and render the world a bit less meaningful.
        
               | RandomLensman wrote:
                | Fair, but I think that will shake out more easily than
                | expected: if there is a market (i.e. it is being valued)
                | for certain things being human generated, people will
                | work on being able to authenticate their output. Yes,
                | there will likely be fraud etc., but if there is a
                | reasonable market it has a good chance of working
                | because it serves all participants.
        
             | aleph_minus_one wrote:
             | > Why even care if it is written by a machine or not? I am
             | not sure it matters as much as people think.
             | 
              | You don't see the writing on the wall? OK, here is a big
              | hint: it might make a huge difference from a legal
             | perspective whether some "photo" showing child sexual abuse
             | (CSA) was generated using a camera and a real, physical
             | child, or by some AI image generator.
        
               | RandomLensman wrote:
               | I don't think all jurisdictions make that distinction to
               | start with and even if they did and societies really
               | wanted to go there: not sure why a licensing regime on
               | generators with associated cryptographic information in
               | the images could not work. We don't have to be broadly
               | permissive, if at all.
        
           | toss1 wrote:
           | Yup.
           | 
           | Moreover, the way to deal with AI in this context is not like
           | the way to deal with plagiarism; do _not_ try to detect AI
           | and punish its use.
           | 
            | Instead, assign its use, and have the students critique the
           | output and find the errors. This both builds skills in using
           | a new technology, and more critically, builds the essential
           | skills of vigilance for errors, and deeper understanding of
           | the material -- really helping students strengthen their BS
           | detectors, a critical life skill.
        
           | siglesias wrote:
           | I'd challenge this assumption. ChatGPT is supposed to convey
           | information and answer questions in a manner that is
           | intelligible to humans. It doesn't mean it should write
           | indistinguishably from humans. It has a certain manner of
           | prose that (to me) is distinctive and, for lack of a better
           | descriptor, silkier, more anodyne, than most human writing.
           | It should only attempt a distinct style if prompted to.
        
             | constantcrying wrote:
              | ChatGPT is explicitly trained on _human writing_; its
              | training goal is explicitly to emulate human writing.
             | 
             | >It should only attempt a distinct style if prompted to.
             | 
             | There is no such thing as an indistinct style. Any
             | particular style it could have would be made distinct by it
             | being the style ChatGPT chooses to answer in.
             | 
             | The answers that ChatGPT gives are usually written in a
             | style combining somewhat dry academic prose and the type of
             | writing you might find in a Public Relations statement.
             | ChatGPT sounds very confident in the responses it generates
             | to the queries of users, even if the actual content of the
              | information is quite doubtful. With some attention to
              | detail I believe that it is quite possible for humans to
              | emulate that style; further, I believe that the style was
             | designed by the creators of ChatGPT to make the output of
             | the machine learning algorithm seem more trustworthy.
        
             | JustBreath wrote:
             | That's true, you could even purposely inject fingerprinting
             | into its writing style and it could still accomplish the
             | goal of conveying information to people.
        
               | insanitybit wrote:
               | All I would have to do is run the same tool over the
               | text, see it gets flagged, and then modify the text until
               | it no longer gets flagged. That's assuming I can't just
               | prompt inject my way out of the scenario.
        
               | littlestymaar wrote:
               | But then that wouldn't be "detecting AI", but merely
               | recognizing an intentionally added fingerprint, which
               | sounds far less attractive...
        
           | Teever wrote:
            | Nitpick: ChatGPT is supposed to write in a way that is
            | indistinguishable from a human, to another human.
            | 
            | That doesn't mean that it can't be distinguishable by some
            | other means.
        
             | CookieCrisp wrote:
              | I think for small amounts of text there's no way around
              | it: if it's not distinguishable to a human, it won't be
              | distinguishable to a machine either. There just aren't
              | that many combinations of words that still flow well.
              | Furthermore, as
             | more and more people use it I think we'll find some humans
             | changing their speech patterns subconsciously more to mimic
              | whatever it does. I imagine with longer text there will be
              | things they'll be able to find, but I think it will end up
              | being trivial for others to detect what those changes are
              | and then modify the result enough to be undetectable.
        
               | jerf wrote:
               | I think for this sort of problem it is more productive to
               | think in terms of the amount of text necessary for
               | detection, and how reliable such a detection would be,
               | than a binary can/can't. I think similarly for how
               | "photorealistic" a particular graphics tech is; many
               | techs have already long passed the point where I can tell
               | at 320x200 but they're not necessarily all there yet at
               | 4K.
               | 
               | LLMs clearly pass the single sentence test. If you
               | generate far more text than their window, I'm pretty sure
               | they'd clearly fail as they start getting repetitive or
               | losing track of what they've written. In between, it
               | varies depending on how much text you get to look at. A
               | single paragraph is pretty darned hard. A full essay
               | starts becoming something I'm more confident in my
               | assessment.
               | 
               | It's also worth reminding people that LLMs are more than
               | just "ChatGPT in its standard form". As a human trying to
               | do bot detection sometimes, I've noticed some tells in
               | ChatGPT's "standard voice" which almost everyone is still
               | using, but once people graduate from "Write a blog post
               | about $TOPIC related to $LANGUAGE" to "Write a blog post
                | about $TOPIC related to $LANGUAGE in the style of Ernest
                | Hemingway" in their prompts it's going to become very
               | difficult to tell by style alone.
        
             | leonardtang wrote:
             | Precisely -- watermarks are an obvious example of this. To
             | me, this is THE path forward for AI content detection.
        
               | WillPostForFood wrote:
               | Watermarking text can't work 100% and will have false
               | negatives and false positives. It is worse than nothing
               | in many situations. It is nice when the stakes are low,
               | but when you really need it you can't rely on it.
        
             | bob-09 wrote:
              | If a human can't verify whether flagged text is actually
              | AI or not, detection will be full of false positives and
              | ultimately unreliable.
        
           | bko wrote:
           | Copying a comment I posted a while ago:
           | 
           | I listened to a podcast with Scott Aaronson that I'd highly
           | recommend [0]. He's a theoretical computer scientist but he
           | was recruited by OpenAI to work on AI safety. He has a very
           | practical view on the matter and is focusing his efforts on
           | leveraging the probabilistic nature of LLMs to provide a
           | digital undetectable watermark. So it nudges certain words to
           | be paired together slightly more than random and you can
           | mathematically derive with some level of certainty whether an
           | output or even a section of an output was generated by the
           | LLM. It's really clever and apparently he has a working
           | prototype in development.
           | 
            | One workaround he hasn't figured out yet is asking for an
            | output in language X and then translating it into language Y.
            | But that may still eventually be figured out.
           | 
           | I think watermarking would be a big step forward to practical
           | AI safety and ideally this method would be adopted by all
           | major LLMs.
           | 
           | That part starts around 1 hour 25 min in.
           | 
           | > Scott Aaronson: Exactly. In fact, we have a pseudorandom
           | function that maps the N-gram to, let's say, a real number
           | from zero to one. Let's say we call that real number ri for
           | each possible choice i of the next token. And then let's say
           | that GPT has told us that the ith token should be chosen with
           | probability pi.
           | 
           | https://axrp.net/episode/2023/04/11/episode-20-reform-ai-
           | ali...
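            | 
            | A back-of-the-envelope sketch of that scheme as I understand
            | it from the episode (the key handling and the scoring rule
            | here are my guesses at the shape of it, not OpenAI's actual
            | implementation):
            | 
            |   import hashlib, math
            | 
            |   def r(key, ngram, token):
            |       # Pseudorandom map of (n-gram, candidate token)
            |       # to a number in (0, 1), seeded by a secret key.
            |       d = hashlib.sha256(
            |           f"{key}|{ngram}|{token}".encode()).digest()
            |       return (int.from_bytes(d[:8], "big") + 0.5) / 2**64
            | 
            |   def pick(key, ngram, probs):
            |       # probs: {token: p_i}, all p_i > 0.
            |       # Choose the token i maximizing r_i ** (1 / p_i).
            |       # This still samples i with probability p_i, so
            |       # quality is unchanged, but the choice is now a
            |       # deterministic function of context + key.
            |       return max(probs,
            |                  key=lambda t: r(key, ngram, t)
            |                  ** (1 / probs[t]))
            | 
            |   def score(key, pairs):
            |       # pairs: [(ngram, chosen_token), ...]; averages
            |       # ln(1/(1-r)), which is ~1.0 for ordinary text
            |       # and noticeably higher for pick()'s output.
            |       return sum(math.log(1 / (1 - r(key, n, t)))
            |                  for n, t in pairs) / len(pairs)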
        
             | constantcrying wrote:
              | I think the chance of this working reliably is precisely
              | zero. There are multiple trivial attacks against this, and
              | it _cannot_ work if the user has any kind of access to
              | token-level data (where they could trivially substitute
              | their own truly random choice). And if there is a
              | non-watermarking neural network with enough capacity to do
              | simple rewriting, you can easily remove any watermark, or
              | the user can do the minor rewrite themselves.
        
             | concurrentsquar wrote:
             | This, or cryptographic signing (like what the C2PA
             | suggests) of all real digital media on the Earth are the
             | only ways to maintain consensus reality
             | (https://en.wikipedia.org/wiki/Consensus_reality) in a
             | post-AI world.
             | 
             | I personally would want to live in Aaronson's world, and
             | not the world where a centralized authority controls the
             | definition of reality.
        
               | greiskul wrote:
                | How can we maintain consensus reality, when it has never
                | existed? There are a couple of bubbles of humanity where
                | honesty and skepticism are valued. Everywhere else, at
                | all moments of history, truth has been manipulated to
                | subjugate people. Be it newspapers owned by political
                | families, priests, etc.
        
             | air7 wrote:
             | I heard of this (very neat) idea and gave it some thought.
             | I think it can work very well in the short term. Perhaps
             | OpenAI has already implemented this and can secretly detect
             | long enough text created by GPT with high levels of
             | accuracy.
             | 
              | However, as soon as a detection tool becomes publicly
              | available (or even just the knowledge that watermarking has
             | been implemented internally), a simple enough garbling LLM
             | would pop up that would only need to be smart enough to
             | change words and phrasing here and there.
             | 
             | Of course these garbling LLMs could have a watermark of
             | their own... So it might turn out to be a kind of cat-and-
             | mouse game but with strong bias towards the mouse, as FOSS
             | versions of garblers would be created or people would
             | actually do _some_ work manually, and make the changes by
             | hand.
        
               | constantcrying wrote:
               | There are already quite complex language models which can
               | run on a CPU. Outside of the government banning personal
               | LLMs, the chance of there not existing a working fully
               | FOSS and open data rewrite model, if it becomes known
               | that ChatGPT output is marked, seems very low.
               | 
                | The watermarking techniques also cannot work after some
                | level of sophisticated rewriting. There simply will be no
                | data encoded in the probabilities of the words.
        
               | motti wrote:
               | If it's sophisticatedly rewritten then it's no longer AI
               | generated
        
             | ummonk wrote:
             | This would be trivially broken once sufficiently good open
             | source pretrained LLMs become available, as bad actors
             | would simply use unwatermarked models.
        
             | vunderba wrote:
             | Even if you could force the bad actors to use this
             | watermarked large language model, there's no guarantee that
             | they couldn't immediately feed that through Langchain into
             | a different large language model that would render all the
             | original watermarks useless.
        
           | insanitybit wrote:
           | I tried an experiment when GPT4 allowed for browsing. I sent
           | it my website and asked it to read my blog posts, then to
           | write a new blog post in my writing style. It did an ok job.
           | Not spectacular but it did pick up on a few things (I use a
           | lot of -'s when I write).
           | 
           | The point being that it's already possible to change
           | ChatGPT's tone significantly. Think of how many people have
           | done "Write a poem but as if <blah famous person> wrote it".
           | The idea that ChatGPT could be reliably detected is kind of
           | silly. It's an interesting problem but not one I'd feel
           | comfortable publishing a tool to solve.
        
         | feoren wrote:
         | Indeed it's not possible. Say you had a classifier that
         | detected whether a given text was AI generated or not. You can
         | easily plug this classifier into the end of a generative
         | network trying to fool it, and even backpropagate all the way
         | from the yes/no output to the input layer of the generative
         | network. Now you can easily generate text that fools that
         | classifier.
         | 
         | So such a model is doomed from the start, unless its parameters
         | are a closely-guarded secret (and never leaked). Then it means
         | it's foolable by those with access and nobody else. Which means
         | there's a huge incentive for adversaries to make their own,
         | etc. etc. until it's just a big arms race.
         | 
         | It's clear the actual answer needs to be: we need better
         | automated tools to detect _quality content_, whatever that
         | might mean, whether written by a human or an AI. That would be
         | a godsend. And if it turned into an arms race, the arms we're
         | racing each other to build are just higher-quality content.
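         | 
         | For concreteness, a schematic sketch of that attack with toy
         | stand-in models (real text is discrete, so in practice you
         | would need Gumbel-softmax or RL rather than plain backprop
         | through soft tokens):
         | 
         |   import torch
         |   import torch.nn as nn
         |   import torch.nn.functional as F
         | 
         |   VOCAB, DIM, SEQ = 1000, 64, 32
         |   gen = nn.Sequential(nn.Linear(DIM, DIM), nn.ReLU(),
         |                       nn.Linear(DIM, VOCAB))
         |   det = nn.Linear(VOCAB * SEQ, 1)      # 1 = "AI" logit
         |   for p in det.parameters():           # detector is frozen
         |       p.requires_grad_(False)
         | 
         |   opt = torch.optim.Adam(gen.parameters(), lr=1e-4)
         |   for _ in range(100):
         |       z = torch.randn(8, SEQ, DIM)     # prompt stand-in
         |       soft = gen(z).softmax(-1)        # soft token dists
         |       ai_logit = det(soft.flatten(1))  # grads flow through
         |       loss = F.softplus(ai_logit).mean()  # -log P("human")
         |       opt.zero_grad(); loss.backward(); opt.step()
         | 
         | A real attack would also add a language-modelling term so the
         | text stays fluent while the detector's score drops.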
        
         | catboybotnet wrote:
         | There's also the post going around about how it can (and does)
         | falsely flag human posts as AI output, particularly among some
         | autistic people. About as useful as a polygraph, no?
        
           | LordDragonfang wrote:
           | TBH, a properly-administered polygraph is probably _more_
            | accurate than OpenAI's detector (of course, "properly
           | administered" requires the subject to be cooperative and
           | answer very simple yes or no questions, because a poly
           | measures subconscious anxiety, not "truth")
        
             | carapace wrote:
             | Polygraph is pseudo-science, it measures nothing.
        
               | LordDragonfang wrote:
                | I mean, it literally and factually measures multiple of
                | your body's autonomic responses - all of which are provably
               | correlated with stress. That's what a polygraph machine
               | _is_. Saying it measures _nothing_ is factually
               | incorrect.
               | 
               | You can't detect "truth" from that, but you can often
               | tell (i.e. with better accuracy than chance) whether or
               | not a subject is able to give a confident, uncomplicated
               | yes-or-no to a straightforward question in a situation
               | where they don't have to be particularly nervous (which
               | is why it's not very useful for interrogating a stressed
               | criminal suspect, and should absolutely be inadmissible
               | in court).
               | 
               | But everyone knows that it's not very reliable in almost
               | every circumstance it's used. My point is that while only
               | marginally better than chance, it's still _better_ than
                | chance, unlike OpenAI's detector, which is
                | _significantly worse_ than chance.
        
           | hef19898 wrote:
           | We could combine those, couldn't we?
        
             | TheSpiceIsLife wrote:
             | Some kind of Voigt-Kampff Test, perhaps.
        
               | moffkalast wrote:
               | Something something cells, interlinked.
        
             | yowlingcat wrote:
             | You could but is there any reason to believe these two
             | noisy signals wouldn't result in more combined noise than
             | signal?
             | 
             | Sure, it's theoretically possible to add two noisy signals
             | that are uncorrelated and get noise reduction, but is it
             | probable this would be such a case?
        
               | cconstantine wrote:
               | Yes, you can :)
               | 
               | It all depends on the properties of the signal and the
               | noise. In photography you can combine multiple noisy
               | images to increase the signal to noise ratio. This works
               | because the signal increases O(N) with the number of
               | images but the noise only increases O(sqrt(N)). The
               | result is that while both signal and noise are
               | increasing, the signal is increasing faster.
               | 
               | I have no idea if this idea could be used for AI
               | detection, but it is possible to combine 2 noisy signals
               | and get better SNR.
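                | 
                | Toy check of the stacking argument (nothing to do with
                | AI detection as such): averaging N independent noisy
                | exposures improves SNR by roughly sqrt(N).
                | 
                |   import numpy as np
                | 
                |   rng = np.random.default_rng(0)
                |   signal, sigma = 5.0, 2.0
                |   for n in (1, 4, 16, 64):
                |       frames = signal + rng.normal(
                |           0, sigma, size=(n, 100_000))
                |       snr = signal / frames.mean(axis=0).std()
                |       print(n, round(snr, 1))  # ~2.5, 5, 10, 20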
        
               | NeoTar wrote:
               | If the noisy signals are not completely correlated then
               | the signal would be enhanced; however in this case I
               | imagine that there is likely to be a strong correlation
               | between different tools which would mean adding
               | additional sources may not be so useful.
        
           | capableweb wrote:
           | Both false-positives are as useful as the other one, flagged
           | "human" but actually "LLM" vs flagged "LLM" but actually
           | "human". As long as no one put too much weight on the result,
           | no harm would have been done, in either case. But clearly,
           | people can't stay away from jumping to conclusions based on
           | what a simple-but-incorrect tool says.
        
             | arcticbull wrote:
             | Seems a tautology no? "As long as we ignore the results the
             | results don't matter."
        
             | frumper wrote:
             | A tool that gives incorrect and inconsistent results
             | shouldn't have any part of a decision making process. There
             | is no way to know when it's wrong so you'll either use it
             | to help justify what you want, or ignore it.
             | 
             | Edit: this tool is as reliable as a magic 8-ball
        
               | bgirard wrote:
               | > A tool that gives incorrect and inconsistent results
               | shouldn't have any part of a decision making process.
               | 
                | It can be used for some decisions (i.e. not critical
                | ones), but it should NOT be used to accuse someone of
               | academic misconduct unless the tool meets a very robust
               | quality standard.
               | 
               | > this tool is as reliable as a magic 8-ball
               | 
               | Citation needed
        
               | frumper wrote:
                | The AI tool doesn't give accurate results, you don't know
                | when it's inaccurate, and there is no reliable way to
                | check its results. Why would anyone use a tool to help
                | them make a decision when they don't know when it will be
                | wrong and it has a low rate of accuracy? It's in the
                | article.
        
               | bgirard wrote:
               | > The AI tool doesn't give accurate results.
               | 
                | Nearly nothing gives 100% accurate results. Even CPUs
                | have had bugs in their calculations. You have to use a
                | suitable tool for a suitable job, in the correct context,
                | and understand its limitations in order to apply it
                | correctly. Now that is proper engineering. You're
                | partially correct, but you're overstating it:
               | 
               | > A tool that gives incorrect and inconsistent results
               | shouldn't have any part of a decision making process.
               | 
               | That's totally wrong and an overstated position.
               | 
               | A better position is that some tools have such a low
               | accuracy rate that they shouldn't be used for their
                | intended purpose. Now that is a position I agree with. I
               | accept that CPUs may give incorrect results due to a
               | cosmic ray event, but I wouldn't accept a CPU that gives
               | the wrong result for 1/100 instructions.
        
               | frumper wrote:
               | The thread is about tools to evaluate LLMs. Please re-
               | read my comment in that light and generously assume I'm
               | talking about that.
        
               | __loam wrote:
               | Your comment applies to all these tools though lol. No
               | need to clarify, it's all a probabilistic machine that's
               | very unreliable.
        
               | CamperBob2 wrote:
               | ChatGPT isn't the only AI. It is possible, and
               | inevitable, to train other models specifically to avoid
               | detection by tools designed to detect ChatGPT output.
               | 
               | The whole silly concept of an "AI detector" is a subset
               | of an even sillier one: the notion that human creative
               | output is somehow unique and inimitable.
        
               | tyingq wrote:
               | > _" should NOT be used to accused someone of academic
               | misconduct unless the tool meets a very robust quality
               | standard."_
               | 
               | Meanwhile, the leading commercial tools for plagiarism
               | detection often flag properly cited/annotated quotes from
               | sources in your text as plagiarism.
        
               | mananaysiempre wrote:
               | That sounds like a less serious problem--if the tool
                | highlights the allegedly plagiarized sections, at worst
               | the author can conclusively prove it false with no
               | additional research (though that burden should instead be
               | on the tool's user, of course). So it's at least
               | _possible_ to use the tool to get meaningful results.
               | 
               | On the other hand, an opaque LLM detector that just
               | prints "that was from an LLM, methinks" (and not e.g. a
               | prompt and a seed that makes ChatGPT print its input)
               | essentially _cannot_ be proven false by an author who
               | hasn't taken special precautions against being falsely
               | accused, so the bar for sanctioning people based on its
               | output must be much higher (infinitely so as far as I am
               | concerned).
        
               | dontreact wrote:
               | If you were trying to predict the direction a stock will
               | move (up or down) and it was right 99.9% of the time,
               | would you use it or not?
        
               | a13o wrote:
               | This is a strawman. First, the AI detection algorithms
               | can't offer anything close to 99.9%. Second, your
               | scenario doesn't analyze another human and issue
               | judgement, as the AI detection algorithms do.
               | 
               | When a human is miscategorized as a bot, they could find
               | themselves in front of academic fraud boards, skipped
               | over by recruiters, placed in the spam folder, etc.
        
               | dontreact wrote:
               | It's not a strawman. There are many fundamentally
               | unpredictable things where we can't make the benchmark be
               | 100% accuracy.
               | 
               | To make it more concrete on work I am very familiar with:
               | breast cancer screening. If you had a model that
               | outperformed human radiologists at predicting whether
               | there is pathology confirmed cancer within 1 year, but
               | the accuracy was not 100%, would you want to use that
               | model or not?
        
               | frumper wrote:
               | It's a strawman because they aren't comparable to AI
               | detection tests. A screening coming back as possible
               | cancer will lead to follow up tests to confirm, or rule
               | out. An AI detection test coming back as positive can't
               | be refuted or further tested with any level of accuracy.
               | It's a completely unverifiable test with a low accuracy.
        
               | dontreact wrote:
               | You are moving the goalposts here. The original claim I
               | am responding to is "A tool that gives incorrect and
               | inconsistent results shouldn't have any part of a
               | decision making process."
               | 
               | I agree that there are places where we shouldn't put AI
               | and that checking whether something is an LLM or not is
               | one of them. However I think the sentence above takes it
               | way too far and breast cancer screening is a pretty clear
               | example of somewhere we should accept AI even if it can
               | sometimes make mistakes.
        
               | frumper wrote:
               | The thread is about tools to evaluate LLMs. Please re-
               | read my comment in that light and generously assume I'm
               | talking about that.
        
               | hn_go_brrrrr wrote:
               | This is an unreasonable standard. Outside of trivial
               | situations, there are no infallible tools.
        
               | frumper wrote:
                | You're right. After re-reading what I wrote, there should
                | be some reasonable expectations about a tool, such as how
                | accurate it is or what the consequences of being wrong are.
               | 
               | The AI detection tool fails both as it has a low accuracy
                | and could ruin someone's reputation and livelihood. If a
               | tool like this helped you pick out what color socks
               | you're wearing, then it's just as good as asking a magic
               | 8-ball if you should wear the green socks.
        
             | ImprobableTruth wrote:
             | Flagged "human" but actually "LLM" is not a false positive,
             | but a false negative.
        
               | WillPostForFood wrote:
               | It depends how the question is framed: are you asking to
               | confirm humanity, or confirm LLM.
               | 
               | If you are asking, is this LLM text Human generated, and
                | it says Human (yes), then it is a false positive.
               | 
               | If you are asking is this LLM generated text LLM
                | generated, and it says Human (no), then it is
               | a false negative.
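                | 
                | Toy illustration: the same mistake gets a different
                | name depending on which class you call positive.
                | 
                |   truth, pred = "LLM", "Human"
                | 
                |   def name(truth, pred, positive):
                |       if pred == positive:
                |           return ("true positive" if truth == positive
                |                   else "false positive")
                |       return ("false negative" if truth == positive
                |               else "true negative")
                | 
                |   print(name(truth, pred, "Human"))  # false positive
                |   print(name(truth, pred, "LLM"))    # false negative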
        
         | xattt wrote:
         | I see no reason why watermarking can't be broken by having
         | someone simply rephrase/redraw the output.
         | 
         | Yes, it's still work, but it's one step removed from having to
          | think up the original content.
        
           | whimsicalism wrote:
           | Watermarking was never going to be successful except for the
           | most naive uses.
        
             | SkyPuncher wrote:
             | It can likely work in images where you can make subtle,
             | human-undetectable tweaks across thousands/millions of
             | pixels, each with many possible values.
             | 
             | Nearly impossible across data with a couple hundred
             | characters and dozens to thousands of tokens.
        
               | whimsicalism wrote:
               | right but the non-naive approach would be to add noise or
               | have a dumber model rewrite the image. agreed it is
               | easier with images though
        
         | specproc wrote:
         | I'm still interested in this line of enquiry.
         | 
         | These models are clearly not good enough for decision-making,
         | but still might tell an interesting story.
         | 
         | Here's an easily testable exercise: get a load of news from
         | somewhere like newsapi.ai, run it through an open model and
         | there should be a clear discontinuity around ChatGPT launch.
         | 
         | We can assume false positives and false negatives, but with a
         | fat wadge of data we should still be able to discern trends.
         | 
         | Certainly couldn't accuse a student of cheating with it, but
         | maybe spot content farms.
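         | 
         | Something like this, with the detector and the feed as stand-
         | ins (a real run would plug in newsapi.ai and an open
         | classifier):
         | 
         |   import random
         |   import datetime as dt
         | 
         |   def ai_probability(text):
         |       return random.random()   # stand-in classifier
         | 
         |   launch = dt.date(2022, 11, 30)
         |   articles = [(dt.date(2022, 1, 1)
         |                + dt.timedelta(days=random.randrange(700)),
         |                "article body ...") for _ in range(5000)]
         | 
         |   def flagged_share(rows):
         |       hits = [ai_probability(b) > 0.9 for _, b in rows]
         |       return sum(hits) / max(len(hits), 1)
         | 
         |   pre = flagged_share([a for a in articles
         |                        if a[0] < launch])
         |   post = flagged_share([a for a in articles
         |                         if a[0] >= launch])
         |   print(pre, post)
         | 
         | With a real detector, a wave of generated content would show
         | up as the post-launch share climbing well above the pre-
         | launch baseline, even with plenty of noise per article.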
        
         | BestGuess wrote:
         | Taking away tools don't seem to me like the best response same
         | way taking away things tends never to be. If the problem is
         | people not using it right, that seems to me like it would be
         | designed wrong for what people need it for. Like if the issue
          | is using it wrong with too few sentences, then put a minimum
          | sentence count or something to get that minimum likelihood.
         | 
         | Same goes for representing what it means. If people don't
         | understand statistics or math and such, then show what it means
         | with circles or coins or stuff like that. Point is don't seem
         | ever a good thing for options to get removed, especially if
         | it's for bein cynical and judgin people like they're beneath
         | deservin it. Don't make no sense.
        
           | insanitybit wrote:
           | The problem isn't people not using it right, the problem is
           | that the tool can never work and just by being out in the
           | world it would cause harm.
           | 
           | If I have a tool that returns a random number between 0 and
           | 1, indicating confidence that text is AI generated, is that
           | tool good? Is it ethical to release it? I'd say no, it isn't.
           | Removing the option is far better because the tool itself is
           | harmful.
        
             | BestGuess wrote:
              | I don't agree with that premise. I don't know that it
              | _can't_ work, that'd suggest something like no matter what it's
             | worse than a coin flip. I don't think it's that bad or at
             | least nobody showed me anything of it being that bad. You'd
             | have to show me that it can't work and that seems to me a
             | pretty big ask I know
        
               | insanitybit wrote:
               | All that has to be shown is that the tool is as bad as or
                | worse than random _today_, in order to remove it today.
        
               | BestGuess wrote:
               | From the article, "while incorrectly labeling the human-
               | written text as AI-written 9% of the time."
               | 
               | Seems like from what the article we're talkin about says
               | it definitely ain't worse than random by far. Thing you
               | most want to avoid is wrongly labeling humans as AI-
               | written so that seems pretty good. Though it only
               | identified 26% of AI text as "likely AI-written" that's
               | still better than nothing, and better than random. But we
               | don't know or I don't know from the article if that's on
               | the problem cases of less than 1,000 characters or not.
               | It don't say what the *best case* is just what the
               | general cases are.
               | 
               | Anyhow don't seem to me worse than random is the issue
               | here
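                | 
                | For what it's worth, the quick arithmetic on those
                | numbers (26% of AI text flagged, 9% of human text
                | flagged), with the share of AI-written submissions as
                | an assumption:
                | 
                |   tpr, fpr = 0.26, 0.09
                |   for ai_share in (0.05, 0.20, 0.50):
                |       hit = tpr * ai_share
                |       false_alarm = fpr * (1 - ai_share)
                |       precision = hit / (hit + false_alarm)
                |       print(ai_share, round(precision, 2))
                |       # -> 0.13, 0.42, 0.74: most flags point at
                |       #    human text unless AI text is very common
                | 
                | So it beats a coin flip, but most of the people it
                | flags could still be innocent.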
        
               | insanitybit wrote:
               | You're right, I should have been less specific. If the
               | harm of false positives is significant you may not need
               | to have random or worse than random results to feel
               | obligated to stop the project.
        
               | BestGuess wrote:
               | alright. thanks for your thoughts
        
               | RugnirViking wrote:
               | I'd want to see a lot better than "better than random"
               | for the type of tool which is already being used to
               | discipline students for academic misconduct, making
               | hiring and firing decisions over who used AI in what
               | CV/job tasks, and generally used to check if someone
                | deceived others by passing off AI writing as their own; a
                | wrong result can impugn people's reputations.
        
               | BestGuess wrote:
               | Wherever you draw the line someone's going to be upset at
               | where the line is. You're echoing the other guy's
               | concern, really everyone's concern. Same issue with
               | everything from criminal justice to government all around
               | so there's not really any value in yelling personal
               | preferences at one another, even assumin I disagree which
               | I don't. That ain't what I'm about in either case and it
               | don't change what I said about removing options by
               | assuming people suck being a bad way to go about doing
               | anything.
               | 
               | Might as well remove all comment sections because people
               | suck so assume there's no value having one. Pick any
               | number of things like that. Just ain't a good way to go
               | thinking about anything let alone defending a company for
               | removing it, since the same logic justifies removing your
               | ability to criticize or defend it in the first place. You
               | an AI expert? Assume no, so why we let you talk about it?
               | Or me? People suck so why let you comment? On and on like
               | that.
        
           | cjbgkagh wrote:
            | There are numerous people I've tried to get to comprehend
            | statistics: important medical statistics, for doctors, so
            | you would assume they're smart enough to understand. There
            | just seems to be a sufficiently large subset of the
            | population that is blind to statistics, and nothing can be
            | done about it. Even sitting down and carefully going
            | through the math with them doesn't work. No matter how
            | deep into the visualization rabbit hole you go, there will
            | still be a subset
           | that will not get it.
        
             | BestGuess wrote:
             | Alright let's say that's how it is. How happy would
             | everyone else be if they were treated like that even if
             | they weren't like that? I'd be right miffed and I ain't no
             | einstein. My problem is saying it's a good thing to
             | *remove* options just because some people don't know how to
             | use it. Use that kinda logic for other stuff and you'd
             | paint yourself in a corner with a very angry hornet trapped
             | in it, so not the kind of thing you want to encourage if
             | you assume you'd end up the one trapped. I don't know if my
             | message is comin across right do you get me?
        
               | cjbgkagh wrote:
               | What about the patients getting unnecessary treatments?
               | How upset should they be? What about the student expelled
               | for AI plagiarism due to a false reading? These things
                | are unreliable, and despite any number of caveats
                | there is no way to prevent people from over-relying on
                | them. We might as well dunk people in the water to see if
               | they float.
               | 
               | That's a weird kind of extortion, a demand that we
               | placate a subset of the population to the detriment of
               | others. If a conflict came down to people who understand
               | stats versus those blind to it I would put my money on
               | those who understand stats.
        
               | BestGuess wrote:
               | I don't see how that's any different from anything, any
               | tool, any power, any method. Same problem with
               | everything. That's why this don't convince me and just
               | seems like removing things cynically instead of improving
               | it. Seems to me like the company also really don't want
               | its service identified negatively like that and get
               | itself associated with cheaters even if they're the ones
               | selling the cheat identifying, or something like that.
        
               | cjbgkagh wrote:
               | Firstly, this tool cannot be made better than it is due
                | to the nature of its construction; the limitation is
                | intrinsic. Secondly, as LLM models improve, as they are
               | guaranteed to do, this tool can only become worse as it
               | becomes increasingly difficult to distinguish between
               | human and AI written text.
        
               | BestGuess wrote:
               | I don't know about neither of those. How is it intrinsic?
               | What stops detection improving just because AI gets
               | better? Assuming it just doesn't become sentient human
               | replica or something I mean AI like this where it's just
               | a language model thing. Plus that's assuming future stuff
               | you can track in the meanwhile and still don't justify
               | "remove it because people dumb and do bad stuff with
               | tool", that'd only justify removing it later as they do
               | get better.
        
               | cjbgkagh wrote:
                | The algorithms are trained to minimize the difference
                | between what the algorithm produces and what a human
                | produces. The better the algorithms, the smaller the
                | difference. The algorithms are at the point where there
                | is very little difference, and it won't be long until
               | there is no difference.
        
           | RandomLensman wrote:
           | I think it will be increasingly irrelevant what specific
           | process generated a text, for example. Already before genAI
            | people did not generally inquire into how politicians'
            | speeches were crafted, etc.
        
             | arcticbull wrote:
             | Indeed or whether math was done in your head, on a
             | calculator or by a computer. Math is math and the agent
             | that represents the result gets the credit and blame.
        
             | BestGuess wrote:
             | cool beans. I didn't think about it like that. Could be.
        
         | andy99 wrote:
         | > The amount of people in the ecosystem who thinks it's even
         | possible to detect if something is AI written or not when it's
         | just a couple of sentences is staggering high.
         | 
         | I saw that this report came out today which frankly is
         | baffling: https://gpai.ai/projects/responsible-ai/social-media-
         | governa... (Foundation AI Models Need Detection Mechanisms as a
         | Condition of Release [pdf])
        
         | amelius wrote:
         | They could certainly keep a database of things generated by
         | /their/ AI ...
        
           | gmerc wrote:
            | Which would be trivially broken with emoji injection or
           | viewpoint shifting.
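            | 
            | Toy illustration of why the exact-match version is brittle:
            | any trivial edit (an emoji, a synonym) changes the hash, so
            | you would need fuzzy hashing or embedding search instead,
            | and even that can be gamed.
            | 
            |   import hashlib
            | 
            |   db = set()
            | 
            |   def remember(text):
            |       db.add(hashlib.sha256(text.encode()).hexdigest())
            | 
            |   def seen(text):
            |       h = hashlib.sha256(text.encode()).hexdigest()
            |       return h in db
            | 
            |   remember("The cat sat on the mat.")
            |   print(seen("The cat sat on the mat."))      # True
            |   print(seen("The cat sat on the mat. :)"))   # False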
        
           | [deleted]
        
       | atleastoptimal wrote:
       | Even if AI detectors were 99% effective, anyone could just
        | iterate over an AI-produced piece of writing until it's in the 1%
       | that isn't detected and submit it.
        
       | rhyme-boss wrote:
       | This should have been rejected as an idea just on its face. False
       | positives are really problematic. And if it performs unexpectedly
       | well (accuracy is high) then it just becomes a training tool for
       | reinforcement learning.
        
       ___________________________________________________________________
       (page generated 2023-07-25 23:01 UTC)