[HN Gopher] Spotting LLMs with Binoculars: Zero-Shot Detection of Machine-Generated Text
       ___________________________________________________________________
        
       Spotting LLMs with Binoculars: Zero-Shot Detection of Machine-
       Generated Text
        
       Author : victormustar
       Score  : 54 points
       Date   : 2024-01-23 20:29 UTC (2 hours ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | ISL wrote:
       | Is a layman's interpretation of this that LLMs tend to perform
       | like aggregated humanity, but any given human will differ?
       | Since almost all the volume of a high-dimensional sphere lies
       | near its surface, almost nobody is like the mean, so the
       | false-positive rate is low?
       | 
       | It's a clever plan, until the LLMs do some adversarial
       | training....
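       | 
       | A quick numerical check of the geometry claim (a minimal
       | sketch; the dimensions are only illustrative): the fraction
       | of an n-ball's volume lying within radius (1 - eps) of the
       | center is (1 - eps)**n, which collapses toward zero as n
       | grows.
       | 
       |     # Fraction of a d-dimensional ball's volume within
       |     # radius (1 - eps) of the center: (1 - eps) ** d
       |     eps = 0.01
       |     for d in (2, 10, 100, 1000, 10000):
       |         print(d, (1 - eps) ** d)
       |     # 2 -> 0.98, 100 -> 0.37, 10000 -> ~2e-44: nearly all
       |     # the volume sits in a thin shell at the surface.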
        
         | adamgordonbell wrote:
         | This is a super clear explanation!
         | 
         | Perhaps this measurement approximates a human reaction to
         | ChatGPT: 'This writing is distinctly indistinct.'
        
           | actionfromafar wrote:
           | Made me think of how bomber-plane pilot seats were
           | designed for the average human, which meant they fit no
           | human.
        
             | sdsaga12 wrote:
             | Sounds like the bed of Procrustes:
             | https://en.wikipedia.org/wiki/Procrustes#Mythology
        
         | __loam wrote:
         | Yeah, it kind of sucks for people who don't like these
         | systems that efforts to resist them are essentially the
         | same as using GANs to train them.
        
           | shagie wrote:
           | https://www.smithsonianmag.com/history/what-the-luddites-
           | rea...
           | 
           | > They did not invent a machine to destroy technology, but
           | they knew how to use one. In Yorkshire, they attacked frames
           | with massive sledgehammers they called "Great Enoch," after a
           | local blacksmith who had manufactured both the hammers and
           | many of the machines they intended to destroy. "Enoch made
           | them," they declared, "Enoch shall break them."
           | 
           | ... And another reference for the phrase...
           | 
           | https://www.nigeltyas.co.uk/nigel-tyas-news/post/enoch-
           | the-p...
           | 
           | > And here's the funny thing. The weapons they reached for to
           | wield and smash the machines were sledge hammers made by ...
           | the Taylor brothers of Marsden. This irony was not lost on
           | the Luddites and as they swung 'Enoch's hammers' to damage
           | his hated machines they cried: "Enoch made them, and Enoch
           | shall break them".
        
         | lawlessone wrote:
         | >It's a clever plan, until the LLMs do some adversarial
         | training....
         | 
         | it's an unwinnable war.
        
       | jjackson5324 wrote:
       | > false positive rate of 0.01%
       | 
       | What would be an acceptable false positive rate for something
       | like this to be used at schools and universities?
       | 
       | Like, obviously 0.01% is not acceptable, but what would be?
        
         | Etheryte wrote:
         | Why do you think this rate is not acceptable? I would say it's
         | more than acceptable, even as a single data point. If someone's
         | submission comes up as positive on two separate occasions
         | you've pretty much eliminated the chance of a false positive.
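         | 
         | Back-of-the-envelope, assuming (strongly) that the two
         | tests err independently:
         | 
         |     p = 0.0001    # 0.01% false positive rate per test
         |     print(p * p)  # 1e-08 -- one in a hundred million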
        
           | declaredapple wrote:
           | I would agree that it's usable as a single data point,
           | combined with others (other submissions, general
           | performance, etc.).
           | 
           | However, given that we already have professors literally
           | failing people by just pasting essays into ChatGPT and
           | asking if it wrote them, I'm not sure I'm comfortable
           | with that.
        
           | FLT8 wrote:
           | Unless that person's writing style just happens to be very
           | close to how the average human writes? If it's 0.01% for "any
           | given human", I wonder what the numbers would look like for a
           | human closer to average than usual?
        
           | wrs wrote:
           | Only if the measures for an individual are uncorrelated,
           | which there's no reason to assume.
           | 
           | It needs to be <= 0.01% false positive _for each
           | individual author_. If it's just that 0.01% of _all_
           | tests are false positives, that leaves the possibility
           | that a given individual might have anywhere up to 100%
           | false positives.
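           | 
           | A toy illustration of the gap (numbers are hypothetical):
           | 
           |     tests = 10_000 * 10       # 10k authors, 10 essays each
           |     agg_fp = tests * 0.0001   # false positives overall
           |     print(agg_fp)             # 10.0
           |     # Uncorrelated: each essay risks 0.01%, each author
           |     # ~0.1% over 10 essays. Fully correlated: one
           |     # "average-sounding" author eats all 10 -- flagged on
           |     # every essay -- while the aggregate still reads 0.01%.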
        
           | Lerc wrote:
           | I feel like 0.01% is one of the more dangerous levels of
           | false positives. Widespread use would mean far more than
           | 10,000 tests, making it a near certainty that innocent
           | people would be accused. The moderately low false positive
           | rate would then be leaned on to imply guilt.
           | 
           | You might find that every 10,000 tests yield around 151
           | targets (say, 150 genuine positives plus one false
           | positive). For each one individually you can say they
           | probably did it, but cumulatively you can say that one of
           | them is probably innocent.
           | 
           | Consider requiring two positives. Do schools and
           | universities generate 100,000,000 essays per year? Sure
           | would suck to be the innocent person tagged as having only
           | a one in a hundred million chance of not being guilty.
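           | 
           | The base-rate arithmetic behind those numbers, with a
           | hypothetical 1.5% cheating rate and perfect recall
           | assumed for concreteness:
           | 
           |     N, base, fpr = 10_000, 0.015, 0.0001
           |     cheaters = N * base              # 150 true positives
           |     innocent = N * (1 - base) * fpr  # ~0.985 false ones
           |     print(cheaters, innocent)        # 150.0 0.985
           |     # Any one flagged essay is ~99.3% likely genuine
           |     # (150 / 151), yet across the pool roughly one
           |     # flagged person is expected to be innocent.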
        
         | yreg wrote:
         | 0.01% sounds far better than what I would expect!
         | 
         | If the tech froze at its current state, this would be useful
         | for schools. You don't need to expel a student right away after
         | finding a match, but it is a strong indication that something
         | is worth looking into.
         | 
         | (If the goal is to make students write essays, theses, etc.
         | without an LLM writing it for them.)
        
           | shagie wrote:
           | > (If the goal is to make students write essays, theses, etc.
           | without an LLM writing it for them.)
           | 
           | This is one of those "I don't think this is the path we
           | should be taking" situations.
           | 
           | When I was in school, my parents would proofread my essays
           | to catch the spelling and grammatical errors in what I
           | wrote (Bank Street Writer had a rudimentary spelling
           | checker, but that was it -
           | https://en.wikipedia.org/wiki/Bank_Street_Writer ).
           | 
           | While my parents are both native English speakers and
           | college educated, some of my classmates had less involved
           | parents, or parents without the same degree of writing
           | proficiency. Did their essays suffer from a lack of
           | parental proofreading?
           | 
           | In the past few months I wrote two short works of fiction
           | as lore for a game that I play. I used ChatGPT to act as
           | an editor for those works, having it look over the text
           | and occasionally prompting it to help refine a passage.
           | 
           | https://chat.openai.com/share/204de7f7-9cd7-4c45-aa2b-556791.
           | .. for part of the editor session with it.
           | 
           | Having ChatGPT act as an editor (not a text editor, but a
           | critic of the text) helped refine the text that I wrote.
           | 
           | Working _with_ ChatGPT as a tool (one that goes far beyond
           | the red squiggles in a word processor) to help people work
           | with the written word is a good and useful endeavor. This
           | isn't trying to have ChatGPT supplant human creativity but
           | rather to help the person communicate more clearly.
           | 
           | ---
           | 
           | I am leaving this in an unedited form, but here is _this_
           | post with ChatGPT as an editor as an example of how I believe
           | students should try to interact with it. https://chat.openai.
           | com/share/d891f9ac-923b-47a8-8de9-ab7301...
        
             | yreg wrote:
             | Yeah, at some point it's sensible to learn to use all
             | the available tools: calculators in math class, the web
             | while programming, etc.
             | 
             | Still, there is a reason why calculators are not used
             | from the very first grade: it makes sense to learn how
             | to do basic calculations without them.
        
       | vicgalle_ wrote:
       | > https://twitter.com/minimaxir/status/1749893683137454194
       | 
       | Not very promising, though.
        
         | yreg wrote:
         | Well that is not a novel text, is it?
        
         | comex wrote:
         | The paper itself goes into detail about why the US Constitution
         | and other memorized texts are misclassified. It's surprising
         | but not a killer flaw, since in most contexts it would only
         | apply to direct quotes of famous texts.
        
       | TuringNYC wrote:
       | The false positive rate would kill most use cases here. Even
       | a 1/10,000 rate of false academic-integrity accusations would
       | be too much.
        
         | berkes wrote:
         | Is it that bad if it's only an accusation?
         | 
         | Wouldn't it need additional data, such as actual proof, to
         | become an allegation or even a claim or charge?
        
           | lawlessone wrote:
           | >Wouldn't it need additional data, such as actual proof, to
           | become an allegation or even a claim or charge?
           | 
           | Yes but that hasn't stopped people before.
        
         | JoeJonathan wrote:
         | Do you really think professors flag plagiarism only when
         | it's cut and dried? Absolutely not. Plenty, if not most,
         | flagged cases of plagiarism are ambiguous. The process at
         | most colleges and universities typically accounts for this
         | via something like review by an academic integrity
         | committee.
        
       | binsquare wrote:
       | I'm not convinced that we're on the right path in detecting
       | AI-generated content.
       | 
       | We've been looking at the end result and drawing conclusions
       | about the journey - and that will always come with degrees of
       | uncertainty. A false positive rate of 0.01% now probably will
       | not hold as people adapt and grow alongside AI content.
       | 
       | I wonder if anyone's working on software that documents the
       | journey of the output, similar to git commits, such that we
       | can analyze both the metadata (journey) and the output (end
       | result) to determine human authenticity.
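       | 
       | A minimal sketch of what such journey metadata might look
       | like (the names here are hypothetical, not an existing tool):
       | 
       |     import time
       |     from dataclasses import dataclass, field
       | 
       |     @dataclass
       |     class EditEvent:
       |         ts: float      # when the edit happened
       |         pos: int       # character offset in the document
       |         inserted: str  # text added at pos ("" for deletions)
       |         deleted: int   # number of characters removed at pos
       | 
       |     @dataclass
       |     class Journey:
       |         events: list = field(default_factory=list)
       | 
       |         def record(self, pos, inserted="", deleted=0):
       |             self.events.append(
       |                 EditEvent(time.time(), pos, inserted, deleted))
       | 
       |     # Bursty typing, backtracking, and revision leave a very
       |     # different trace than a single 5,000-character paste.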
        
         | tomaskafka wrote:
         | Edit history is awesome training data (as it shows the
         | train of thought), but I can imagine that models will
         | easily learn to generate it.
        
       | Imnimo wrote:
       | According to their demo, the Limitations section of their own
       | GitHub repo is flagged as AI-generated.
       | 
       | >All AI-generated text detectors aim for accuracy, but none are
       | perfect and can have multiple failure modes (e.g., Binoculars is
       | more proficient in detecting English language text compared to
       | other languages). This implementation is for academic purposes
       | only and should not be considered as a consumer product. We also
       | strongly caution against using Binoculars (or any detector)
       | without human supervision.
        
       | etwigg wrote:
       | If the text is good, and someday it will be, I don't care if an
       | LLM wrote it. If it's bad, I don't care if a person wrote it.
       | 
       | The only reason to care is that the implicit proof-of-work signal
       | has broken because LLM text is so cheap. Open forums might need
       | to be pay-per-submission someday...
        
         | Jedd wrote:
         | The problem as I see it is that 'text is good' has two
         | distinct meanings. First, it doesn't sound like an AI wrote
         | it; it's interesting, entertaining, in a word, 'readable'.
         | The second is that it's accurate / true.
         | 
         | It feels like we're happy to take the first as a surrogate
         | for the second, or at least that being good at the first
         | drops our guard on questioning the second.
        
       | Jedd wrote:
       | https://huggingface.co/spaces/tomg-group-umd/Binoculars
        
       ___________________________________________________________________
       (page generated 2024-01-23 23:00 UTC)