[HN Gopher] How we're responding to The NYT's data demands in or...
___________________________________________________________________
How we're responding to The NYT's data demands in order to protect
user privacy
Author : BUFU
Score : 264 points
Date : 2025-06-06 00:35 UTC (22 hours ago)
(HTM) web link (openai.com)
(TXT) w3m dump (openai.com)
| supriyo-biswas wrote:
| I wonder whether OpenAI legal can make the case for storing fuzzy
| hashes of the content, in the form of ssdeep[1] hashes or
| content-defined chunks[2] of said data, instead of the actual
| conversations themselves.
|
| After all, since the NYT has a very limited corpus of
| information, and supposedly people are generating infringing
| content using their APIs, said hashes can be used to check
| whether such content has been generated.
|
| I'd rather have them store nothing, but given the overly broad
| court order I think this may be the best middle ground. Of
| course, I haven't read the lawsuit documents and don't know if
| NYT is requesting far more, or alleging some indirect form of
| infringement which would invalidate my proposal.
|
| [1] https://ssdeep-project.github.io/ssdeep/index.html
|
| [2] https://joshleeb.com/posts/content-defined-chunking.html
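|
| For illustration, a minimal sketch of that comparison using
| the python-ssdeep bindings (the texts and setup below are
| hypothetical stand-ins, not anything OpenAI or the NYT
| actually does):
|
|     import ssdeep  # pip install ssdeep (python-ssdeep bindings)
|
|     # Stand-ins for a copyrighted article and a logged model
|     # output; only the short hash strings would be retained.
|     reference_text = "All the news that's fit to print. " * 200
|     model_output = "All the news that's fit to print. " * 190
|
|     # hash() produces a compact fuzzy-hash string.
|     ref_hash = ssdeep.hash(reference_text)
|     out_hash = ssdeep.hash(model_output)
|
|     # compare() returns a 0-100 similarity score; a high score
|     # flags near-verbatim reproduction without storing the
|     # conversation itself.
|     print(ssdeep.compare(ref_hash, out_hash))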
| paxys wrote:
| Yeah, try explaining any of these words to a lawyer or judge.
| m463 wrote:
| "you are a helpful law assistant."
| fc417fc802 wrote:
| I thought that's what GPT was for.
| landl0rd wrote:
| "You are a long-suffering clerk speaking to a judge who's sat
| the same federal bench for two decades and who believes
| 'everything is computer' constitutes a deep technical
| insight."
| sthatipamala wrote:
| The judges in these technical cases can be quite
| sophisticated and absolutely do learn terms of art. See
| Oracle v. Google (Java API case)
| anshumankmr wrote:
| Looking up the judge for that one
| (https://en.wikipedia.org/wiki/William_Alsup), who was a
| hobbyist BASIC programmer, one would need a judge who coded
| MNIST as a pastime hobby if that is the case.
| king_magic wrote:
| a smart judge who is minimally tech savvy could learn to
| train a model to predict MNIST in a day or two
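|
| For a sense of scale, a minimal sketch of that exercise in
| scikit-learn, using its bundled 8x8 digits set as a stand-in
| for full MNIST (the library choice and rough accuracy are my
| assumption, not from the thread):
|
|     from sklearn.datasets import load_digits
|     from sklearn.linear_model import LogisticRegression
|     from sklearn.model_selection import train_test_split
|
|     # load_digits() ships with scikit-learn, so nothing to
|     # download; each sample is an 8x8 grayscale digit image.
|     X, y = load_digits(return_X_y=True)
|     X_train, X_test, y_train, y_test = train_test_split(
|         X, y, test_size=0.25, random_state=0)
|
|     # Even a plain linear classifier scores roughly 95% here.
|     clf = LogisticRegression(max_iter=1000)
|     clf.fit(X_train, y_train)
|     print(clf.score(X_test, y_test))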
| bigyabai wrote:
| All of that _does_ fit on a real spiffy whitepaper. Let's not
| fool around though: every ChatGPT session is sent directly into
| an S3 bucket that some three-letter spook backs up onto their
| tapes every month. It's a database of candid, timestamped text
| interactions from a bunch of rubes that logged in with their
| Google account - you couldn't ask for a juicier target unless
| you reinvented email. _Of course_ it's backdoored, you can't
| even begin to try proving me wrong.
|
| Maybe I'm alone, but a pinkie-promise from Sam Altman does not
| confer any assurances about my data to me. It's about as
| reassuring as a singing telegram from Mark Zuckerberg dancing
| to a song about how secure WhatsApp is.
| landl0rd wrote:
| Of course I can't even begin trying to prove you wrong.
| You're making an unfalsifiable statement. You're pointing to
| the Russell's Teapot of sigint.
|
| It's well-established that the American IC, primarily NSA,
| collects a lot of metadata about internet traffic. There are
| some justifications for this and it's less bad in the age of
| ubiquitous TLS, but it generally sucks. However, legal
| protections against directly spying on the actual decrypted
| content of Americans are at least in theory stronger.
|
| Snowden's leaks mentioned the NSA tapping inter-DC links of
| Google and Yahoo, so if they had to tap links, I doubt
| there's a ton of voluntary cooperation.
|
| I'd also point out that trying to parse the unabridged
| prodigious output of the SlopGenerator9000 is a really hard
| task unless you also use LLMs to do it.
| cwillu wrote:
| > I'd also point out that trying to parse the unabridged
| prodigious output of the SlopGenerator9000 is a really hard
| task unless you also use LLMs to do it.
|
| The input is what's interesting.
| Aeolun wrote:
| It doesn't change the monumental scope of the problem
| though.
|
| Though I'm inclined to believe the US gov can if OpenAI
| can.
| tdeck wrote:
| > Snowden's leaks mentioned the NSA tapping inter-DC links
| of Google and Yahoo, so if they had to tap links, I doubt
| there's a ton of voluntary cooperation.
|
| The laws have changed since then and it's not for the
| better:
|
| https://www.aclu.org/press-releases/congress-passing-bill-
| th...
| tuckerman wrote:
| Even if the laws give them this power, I believe it would
| be extremely difficult for an operation like this to go
| unnoticed (and therefore unreported) at most of these
| companies. MUSCULAR [1] could be pulled off because of the
| cleartext inter-datacenter traffic, which was subsequently
| encrypted. It's hard to see how they could pull off a
| similar operation without the cooperation of Google, which
| would also entail a tremendous internal cover-up.
|
| [1] https://en.wikipedia.org/wiki/MUSCULAR
| onli wrote:
| Warrantlessly installed backdoors in the log system
| combined with a gag order, combined with secret courts,
| all "perfectly legal". Not really hard to imagine.
| tuckerman wrote:
| You would have to gag a huge chunk of the engineers and I
| just don't think that would work without leaks. Google's
| infrastructure would not make something like that easy to
| do clandestinely (trying to avoid saying impossible but
| it gets close).
|
| I was an SRE and SWE on technical infra at Google,
| specifically the logging infrastructure. I am under no
| gag order.
| komali2 wrote:
| There's no way to know, but it's safer to assume.
| zer00eyz wrote:
| > However, legal protections against directly spying on the
| actual decrypted content of Americans are at least in
| theory stronger.
|
| This was the point of a lot of the Five Eyes programs.
| It's not legal for the US to spy on its own citizens, but it
| isn't against the law for us to do it to the Australians...
| who are all too happy to reciprocate.
|
| > Snowden's leaks mentioned the NSA tapping inter-DC links
| of Google and Yahoo...
|
| Snowden's info wasn't really news for many of us who were
| paying attention in the aftermath of 9/11:
| https://en.wikipedia.org/wiki/Room_641A (This was huge on
| slashdot at the time... )
| dmurray wrote:
| > You're pointing to the Russell's Teapot of sigint.
|
| If there were multiple agencies with billion dollar budgets
| and a belief that they had an absolute national security
| mandate to get a teapot into solar orbit, and to lie about
| it, I would believe there was enough porcelain up there to
| make a second asteroid belt.
| rl3 wrote:
| > _However, legal protections against directly spying on
| the actual decrypted content of Americans are at least in
| theory stronger._
|
| Yeah, because the definition of collection was redefined to
| mean accessing the full content already stored on their
| systems, post-interception. It wasn't considered collected
| until an analyst viewed it. Metadata was a laughable dog and
| pony show that was part of the same legal shell games at
| the time, over a decade ago now.
|
| That said, from an outsider's perspective it sounded like
| the IC did collectively erect robust guard rails such that
| access to information was generally controlled and audited.
| I felt like this broke down a bit once sharing 702 data
| with other federal agencies was expanded around the same
| time period.
|
| These days, those guard rails might be the only thing
| standing in the way of democracy as we know it ending in
| the US. AI processing applied to full-take collection is
| terrifying, just ask the Chinese.
| Yizahi wrote:
| Metadata is spying (c) Bruce Schneier
|
| If a CIA spook is stalking you everywhere, documenting your
| every visible move or interaction, you probably would call
| that spying. Same applies to digital.
|
| Also, the teapot argument can be applied in reverse. We
| have all these documented open digital network systems
| everywhere, and you want to say that one of the most
| unprofitable and certainly most expensive-to-run systems
| is somehow protecting all user data? That belief is based
| on what? At least selling data is based on evidence from
| the industry and on the actual ToSes of other similar
| corpos.
| jstanley wrote:
| The comment you replied to isn't saying that metadata
| isn't spying. It's saying that the spies generally don't
| have free access to content data.
| Workaccount2 wrote:
| My choice conspiracy is that the three letter agencies
| actively support their omnipresent, omniknowing
| conspiracies because it ultimately plays into their hand.
| Sorta like a Santa Claus for citizens.
| bigyabai wrote:
| > because it ultimately plays into their hand.
|
| How? Scared criminals aren't going to make themselves
| easy to find. Three-letter spooks would almost certainly
| prefer to smoke-test a docile population than a paranoid
| one.
|
| In fact, it kinda overwhelmingly seems like _the
| opposite_ happens. Remember the 2015 San Bernardino
| shooting that was pushed into the national news for no
| reason? Remember how the FBI bloviated about how _hard_
| it was to get information from an iPhone, 3 years after
| Tim Cook's assent to the PRISM program?
|
| Stuff like this is almost certainly theater. If OpenAI
| perceived retention as a life-or-death issue, they would
| be screaming about this case from the top of their lungs.
| If the FBI perceived it as a life-or-death issue, we
| would never hear about it in our lifetimes. The dramatic
| and protracted public fights suggest to me that OpenAI
| simply wants an alibi. Some sort of user-story that
| _smells_ like secure and private technology, but in
| actuality is very obviously neither.
| farts_mckensy wrote:
| Think of all the complete garbage interactions you'd have to
| sift through to find anything useful from a national security
| standpoint. The data is practically obfuscated by virtue of
| its banality.
| artursapek wrote:
| I've done my part cluttering it with my requests for the
| same banana bread recipe like 5 separate times.
| refuser wrote:
| It was that good?
| baobun wrote:
| gief
| brigandish wrote:
| Search engines have been doing this since the mid-90s and
| have only improved. To think that any data is _obfuscated_
| by being part of some huge volume of other data is a
| fallacy at best.
| farts_mckensy wrote:
| Search engines use our data for completely different
| purposes.
| yunwal wrote:
| That doesn't negate the GPs point. It's easy to make
| datasets searchable.
| farts_mckensy wrote:
| Searchable? You have to know what to search for, and you
| have to rule out false positives. How do you discern a
| person roleplaying some secret agent scenario vs. a
| person actually plotting something? That's not something
| a search function can distinguish. It requires a human to
| sift through that data.
| bigyabai wrote:
| "We kill people based on metadata." - National Security
| Agency Gen. Michael Hayden
|
| Raw data with time-series significance is their absolute
| favorite. You might argue something like Google Maps data
| is "obfuscated by virtue of its banality" until you catch
| the right person in the wrong place. ChatGPT sessions are
| the same way, and they're going to be fed into aggregate
| surveillance systems the way modern telecom and advertiser
| data is.
| farts_mckensy wrote:
| This is mostly security theater, and generally not worth
| the lift when you consider the steps needed to unlock the
| value of that data in the context of investigations.
| bigyabai wrote:
| Citation?
| farts_mckensy wrote:
| -The Privacy and Civil Liberties Oversight Board's 2014
| review of the NSA "Section 215" phone-record program
| found no instance in which the dragnet produced a
| counter-terror lead that couldn't have been obtained with
| targeted subpoenas.
| https://en.m.wikipedia.org/wiki/Privacy_and_Civil_Liberties_...
|
| -After Boston, Paris, Manchester, and other attacks,
| post-mortems showed the perpetrators were already in
| government databases. Analysts simply didn't connect the
| dots amid the flood of benign hits.
| https://www.newyorker.com/magazine/2015/01/26/whole-haystack
|
| -Independent tallies suggest dozens of civilians killed
| for every intended high-value target in Yemen and
| Pakistan, largely because metadata mis-identifies phones
| that change pockets.
| https://committees.parliament.uk/writtenevidence/36962/pdf
| 7speter wrote:
| Maybe I'm wrong, and maybe this was discussed previously, but
| of course OpenAI keeps our data; they use it for training!
| nl wrote:
| As the linked page points out you can turn this off in
| settings if you are an end user or choose zero retention if
| you are an API user.
| justacrow wrote:
| I mean, they already stole and used all the copyrighted
| material they could find to train the thing. Am I
| supposed to believe that they won't use my data just
| because I tick a checkbox?
| stock_toaster wrote:
| Agreed, I have a hard time believing anything the
| eye-scanning crypto coin (Worldcoin or whatever) guy says
| at this point.
| Jackpillar wrote:
| I wish I could test drive your brain to experience a
| world where one believes that would stop them from
| stealing your data.
| rl3 wrote:
| _> Of course it's backdoored, you can't even begin to try
| proving me wrong._
|
| On the contrary.
|
| _> Maybe I'm alone, but a pinkie-promise from Sam Altman
| does not confer any assurances about my data to me._
|
| I think you're being unduly paranoid. /s
|
| https://www.theverge.com/2024/6/13/24178079/openai-board-
| pau...
|
| https://www.wsj.com/tech/ai/the-real-story-behind-sam-
| altman...
| LandoCalrissian wrote:
| Trying to actively circumvent the intention of a judge's order
| is a pretty bad idea.
| girvo wrote:
| Deeply, deeply so. In fact, so much so that people who
| suggest such things show they've (luckily) not had to
| interact with the legal system much. Judges take an
| incredibly dim view of that kind of thing haha
| Aeolun wrote:
| That's not circumvention though. The intent of the order is
| to be able to prove that ChatGPT regurgitates NYT content,
| not to read the personal communications of all ChatGPT users.
| delusional wrote:
| I haven't been able to find any of the supporting documents,
| but the court order makes it seem like OpenAI has been
| unhelpful in producing any alternative during the conversation.
|
| For example, the judge seems to have asked if it would be
| possible to segregate data that the users wanted deleted from
| other data, but OpenAI has failed to answer. Not just denied
| the request, but simply ignored it.
|
| I think it's quite likely that OpenAI has taken the PR route
| instead of seriously engaging with any way to constructively
| honor the request for retention of data.
| vanattab wrote:
| Protect our privacy? Or protect their right to piracy?
| NBJack wrote:
| Agreed. I don't buy the spin.
| charrondev wrote:
| I mean, the court is ordering them to retain user conversations
| at least until resolution of the court case (in case there are
| copyrighted responses being generated?).
|
| So user privacy is definitely implicated.
| amluto wrote:
| It appears that the "Zero Data Retention" APIs they mention are
| something that customers need to request access to, and that it's
| really quite hard to get this access. I'd be more impressed if
| any API user could use those APIs.
| singron wrote:
| If OpenAI cared about our privacy, ZDR would be a setting
| anyone could turn on.
| JimDabell wrote:
| I believe Apple's agreement includes this, at least when a user
| isn't signed into an OpenAI account:
|
| > OpenAI must process your request solely for the purpose of
| fulfilling it and not store your request or any responses it
| provides unless required under applicable laws. OpenAI also
| must not use your request to improve or train its models.
|
| -- https://www.apple.com/legal/privacy/data/en/chatgpt-
| extensio...
|
| I wonder if we'll end up seeing Apple dragged into this
| lawsuit. I'm sure after telling their users it's private, they
| won't be happy about everything getting logged, even if they do
| have that caveat in there about complying with laws.
| fc417fc802 wrote:
| > I'm sure after telling their users it's private, they won't
| be happy about everything getting logged,
|
| The ZDR APIs are not and will not be logged. The linked page
| is clear about that.
| FireBeyond wrote:
| Sure, OpenAI, I will absolutely trust you.
|
| > The content covered by the court order is stored separately in
| a secure system. It's protected under legal hold, meaning it
| can't be accessed or used for purposes other than meeting legal
| obligations.
|
| That's horse shit and OpenAI knows it. It means no such thing. A
| legal hold is just a 'preservation order'. It says _absolutely
| nothing_ about other access or use.
| fragmede wrote:
| why is it horse shit that OpenAI is saying they've put the
| files in a cabinet that only legal has access to?
| FireBeyond wrote:
| They are saying a "legal hold" means that they have to keep
| the data but don't worry they're not allowed to use it or
| access it for any other reason.
|
| A legal hold requires no such thing and there would be no
| such requirement in it. They are perfectly free to access and
| use it for any reason.
| mmooss wrote:
| OpenAI's other policies, and other laws and regulations, do
| have such requirements. Are they nullified because the data is
| held under a court order?
| mrguyorama wrote:
| "The judge and court need to view this information to
| actually pass justice and decide the case" almost always
| supersedes other laws.
|
| The GDPR does not say that you can never be proven to have
| done something wrong in a court of law.
| mmooss wrote:
| Right. The GGP says the information could be used for other
| purposes.
| tomhow wrote:
| Related discussion:
|
| _OpenAI slams court order to save all ChatGPT logs, including
| deleted chats_ - https://news.ycombinator.com/item?id=44185913 -
| June 2025 (878 comments)
| dangus wrote:
| I think the court order doesn't quite go against as many norms as
| OpenAI is claiming. It's very reasonable to retain data pertinent
| to a case, and NYT's case almost certainly revolves around
| finding out copyright infringement damages, which are calculated
| based on the number of violations (how many users queried ChatGPT
| and were returned verbatim copyrighted material from NYT).
|
| If you don't retain that data you're destroying evidence for the
| case.
|
| It's not like the data is going to be given to anyone, it's only
| going to be used for limited legal purposes for the lawsuit (as
| OpenAI confirms in this article).
|
| And honestly, OpenAI should have just not used copyrighted data
| illegally and they would have never had this problem. I saw NYT's
| filing and it had very compelling evidence that you could get
| ChatGPT to distribute verbatim copyrighted text from the Times
| without citation.
| tptacek wrote:
| _And honestly, OpenAI should have just not used copyrighted
| data illegally and they would have never had this problem_
|
| The whole premise of the lawsuit is that they didn't do
| anything unlawful, so saying "just do what the NYT wanted you
| to do" isn't interesting.
| dangus wrote:
| No, you're misinterpreting how information discovery and the
| court system works.
|
| The NYT made an argument to a judge about what they think is
| going on and how they think the copyright infringement is
| taking place and harming them. In their filings and hearings
| they present the reasoning and evidence they have that leads
| them to believe that a violation is occurring. The court
| makes a judgment on whether or not to order OpenAI to
| preserve and disclose information relevant to the case to the
| court.
|
| It's not "just do what NYT wanted you to do," it's "do what
| the court orders you to do based on a lawsuit brought by a
| plaintiff and argued to the court."
|
| I suggest you read the court filing: https://nytco-
| assets.nytimes.com/2023/12/NYT_Complaint_Dec20...
| lxgr wrote:
| It absolutely goes against norms in many countries other than
| the US, and the data of residents/citizens of these countries
| are affected too.
|
| > It's not like the data is going to be given to anyone, it's
| only going to be used for limited legal purposes for the lawsuit
| (as OpenAI confirms in this article).
|
| Nobody other than both parties to the case, their lawyers, the
| court, and whatever case file storage system they use. In my
| view, that's already way too much given the amount and value of
| this data.
| dangus wrote:
| Countries other than the US aren't part of this lawsuit.
| ChatGPT operates in the US under US law. I don't know if they
| have separated data storage for other countries.
|
| I don't believe you would be considered to be violating the
| GDPR if you are complying with another court order, because
| you are presumably making a best effort to comply with the
| GDPR besides that court order.
|
| You're saying it's unreasonable to store data somewhere for a
| pending court case? Conceptually you're saying that you can't
| preserve data for trials because the filing cabinets might
| see the information. That's ridiculous, if that was true then
| it would be impossible to perform discovery and get anything
| done in court.
| lxgr wrote:
| > I don't believe you would be considered to be violating
| the GDPR if you are complying with another court order,
| because you are presumably making a best effort to comply
| with the GDPR besides that court order.
|
| It most likely depends on the exact circumstances. I could
| absolutely imagine a European court deciding that, sorry,
| but if you have to answer to a court decision incompatible
| with European privacy laws, you can't offer services to
| European residents anymore.
|
| > You're saying it's unreasonable to store data somewhere
| for a pending court case?
|
| I'm saying it can be, depending on how much personal and/or
| unrelated data gets tangled up in it. That seems to be the
| case here.
|
| > Conceptually you're saying that you can't preserve data
| for trials because the filing cabinets might see the
| information.
|
| I'm only saying that there should be proportionality. A
| court having access to all facts relevant to a case is
| important, but it's not the only important thing in the
| world.
|
| Otherwise, we could easily end up with a Dirk-Gently-esque
| court that, based on the principle that everything is
| connected to everything, will just demand access to all the
| data in the world.
| dangus wrote:
| The scope of the data access required by the court is
| being worked out via due process. That's why there's an
| appeal system. OpenAI is just grandstanding in a public
| forum so that their customers don't defect.
|
| When it comes to the GDPR, courts have generally taken the
| stance that it does not override discovery obligations.
|
| Ironburg Inventions, Ltd. v. Valve Corp.
|
| Finjan, Inc. v. Zscaler, Inc.
|
| Corel Software, LLC v. Microsoft
|
| Rollins Ranches, LLC v. Watson
|
| In none of these cases was a GDPR fine issued.
| danenania wrote:
| Putting the merits of this specific case and positive vs.
| negative sentiments toward OpenAI aside, this tactic seems like
| it can be used to destroy any business or organization with
| customers who place a high value on privacy--without actually
| going through due process and winning a lawsuit.
|
| Imagine a lawsuit against Signal that claimed some nefarious
| activity, harmful to the plaintiff, was occurring broadly in
| chats. The plaintiff can claim, like NYT, that it might be
| necessary to examine private chats in the future to make a
| determination about some aspect of the lawsuit, and the judge
| can then order Signal to find a way to retain all chats for
| potential review.
|
| However you feel about OpenAI, this is not a good precedent for
| user privacy and security.
| fc417fc802 wrote:
| That's not entirely fair. The argument isn't "users are using
| the service to break the law" but rather "the service is
| facilitating law breaking". To fix your Signal analogy,
| suppose you could use the chat interface to request
| copyrighted material from the operator.
| charcircuit wrote:
| That doesn't change the outcome being the same: the app
| has to hand over everyone's plaintext messages, including
| the chat history of every user.
| fc417fc802 wrote:
| Right. But requiring logs due to suspicion that the
| service itself is actively violating the law is entirely
| different from doing so on the basis that end users might
| be up to no good _entirely independently_.
|
| Also OpenAI was never E2EE to begin with. They were
| already retaining logs for some period of time.
|
| My personal view is that the court order is overly broad
| and disregards potential impacts on end users but it's
| nonetheless important to be accurate about what is and
| isn't happening here.
| dangus wrote:
| Again, keep in mind that we are talking about a
| case-limited analysis of that data within the privacy of the
| court system.
|
| For example, if the trial happens to find data that some
| chats include crimes committed by users in their private
| chats, the court can't just send police to your door
| based on that information since the information is only
| being used in the context of an intellectual property
| lawsuit.
|
| Remember that privacy rights are legitimate rights but
| they change a lot when you're in the context of an
| investigation/court proceeding. E.g., the right of police
| to enter and search your home changes a lot when they get
| a court-issued warrant.
|
| The whole point of E2EE services from the perspective of
| privacy-conscious customers is that a court can get a
| warrant for data from those companies but they'll only be
| able to produce encrypted blobs with no access to
| decryption keys. OpenAI was always a not-E2EE service, so
| customers have to expect that a court order could surface
| their data to someone else's eyes at some point.
| dangus wrote:
| I'm confused at how you think that NYT isn't going through
| due process and attempting to win a lawsuit.
|
| The court isn't saying "preserve this data forever and ever
| and compromise everyone's privacy," they're saying "preserve
| this data for the purposes of this court while we perform an
| investigation."
|
| IMO, the NYT has a very good argument here that the only way
| to determine the scope of the copyright infringement is to
| analyze requests and responses made by every single customer.
| Like I said in my original comment, the remedies for
| copyright infringement are on a per-infringement basis. E.g.,
| every time someone on LimeWire downloads Song 2 by Blur from
| your PC, you've committed one instance of copyright
| infringement. My interpretation is that NYT wants the court
| to find out how many times customers have received ChatGPT
| responses that include verbatim New York Times content.
| _jab wrote:
| > How will you store my data and who can access it?
|
| > The content covered by the court order is stored separately in
| a secure system. It's protected under legal hold, meaning it
| can't be accessed or used for purposes other than meeting legal
| obligations.
|
| > Only a small, audited OpenAI legal and security team would be
| able to access this data as necessary to comply with our legal
| obligations.
|
| So, by OpenAI's own admission, they are taking abundant and
| presumably effective steps to protect user privacy here? In the
| unlikely event that this data did somehow leak, I'd personally be
| blaming OpenAI, not the NYT.
|
| Some of the other language in this post, like repeatedly calling
| the lawsuit "baseless", really makes this just read like an
| unconvincing attempt at a spin piece. Nothing to see here.
| sashank_1509 wrote:
| Obviously OpenAI's point of view will be their point of view.
| They are going to call this lawsuit baseless; otherwise they
| would not be fighting it.
| ivape wrote:
| To me it's pretty clear the way this will happen. You will
| need to buy additional credits or subscriptions through these
| LLMs that feed payments back to things like NYT and book
| publishers. It's all stolen. I don't even want to hear it.
| This company doesn't want to pay up and is willing to let
| users' privacy hang in the balance to draw the case out until
| they get sure footing with their device launches or the like
| (or additional markets like enterprise, etc).
| fallingknife wrote:
| Copyright is pretty narrowly tailored to verbatim
| reproduction of content so I doubt they will have to pay
| anything.
| tiahura wrote:
| Incorrect. Copyright applies to derivative works.
| vel0city wrote:
| Even then, it's possible to prompt the model to exactly
| reproduce the copyrighted works.
| fallingknife wrote:
| Please show me one of these prompts
| vel0city wrote:
| NYT has examples in their legal complaint. See page 30.
|
| https://www.scribd.com/document/695189742/NYT-v-OpenAI
| Workaccount2 wrote:
| > It's all stolen.
|
| LLMs _are not_ massive archives of data. The big models are
| a few TB in size. No one is forgoing a NYT subscription
| because they can ask ChatGPT to print out NYT news stories.
| edbaskerville wrote:
| Regardless of the representation, some people _are_
| replacing news consumption generally with answers from
| ChatGPT.
| tptacek wrote:
| No, there is a whole news cycle about how chats you delete
| aren't actually being deleted because of a lawsuit; they
| essentially have to respond. It's not an attempt to spin the
| lawsuit; it's about reassuring their customers.
| VanTheBrand wrote:
| The part where they go out of the way to call the lawsuit
| baseless is spin though, and mixing that with this messaging
| presents a mixed message. The NYT lawsuit is objectively not
| baseless. OpenAI did train on the Times and ChatGPT does
| output information from that training. That's the basis of
| the lawsuit. NYT may lose, this could end up being considered
| fair use, it might ultimately be a flimsy basis for a
| lawsuit, but to say it's baseless (and with nothing to back
| that up) is spin and makes this message less reassuring.
| tptacek wrote:
| No, it's not. It's absolutely standard corporate
| communications. If they're fighting the lawsuit, that is
| essentially the only thing they can say about it. Ford
| Motor Company would say the same thing (well, they'd
| probably say "meritless and frivolous").
| bee_rider wrote:
| Standard corporate spin, then?
| tptacek wrote:
| No? "Spin" implies there was something else they could
| possibly say.
| mmooss wrote:
| I haven't heard that interpretation; I might call it spin
| of spin.
| justacrow wrote:
| They could choose to not say it
| ethbr1 wrote:
| Indeed. Taken to its conclusion, this thread suggests
| that corporations are justified in saying whatever they
| want in order to further their own ends.
|
| Including lies.
|
| I'd like to aim a little higher, maybe towards expecting
| correspondence with reality?
|
| IOW, yes, there is no law that OpenAI can't try to spin
| this. But it's still a shitty, non-factually-based choice
| to make.
| mrgoldenbrown wrote:
| If you're being held at gunpoint and forced to lie, your
| words are still a lie. Whether you were forced or not is
| a separate dimension.
| bee_rider wrote:
| That is unrelated to what the expression means.
| bunderbunder wrote:
| No, this isn't even close to spin, it's just a standard
| part of defending your case. In the US tort system you
| need to be constantly publicly saying you did nothing
| wrong. Any wavering on that point could be used against
| you in court.
| jmull wrote:
| This is a funny thread. You say "No" but then restate the
| point with slightly different words. As if anything a
| company says publicly about ongoing litigation isn't
| spin.
| bunderbunder wrote:
| I suppose it's down to how you define "spin". Personally
| I'm in favor of a definition of the term that doesn't
| excessively dilute it.
| bee_rider wrote:
| Can you share your definition? This is actually quite
| puzzling because as far as I know "spin" has always been
| associated with presenting things in a way that benefits
| you. Like, decades ago, they could have the show "Bill
| O'Reilly's No Spin Zone" and everybody knew the premise
| was that they argue against guests who were trying to
| tell a "massaged" version of the story, and that they'd
| go for some actual truth (fwiw I thought the whole show
| was full of crap, but the name was not confusing or
| ambiguous).
|
| I'm not aware of any definition of "spin" where being
| conventional is a defense against that accusation.
| Actually, that was the (imagined) value-add of the show,
| that conventional corporate and political messaging is
| heavily spun.
| adamsb6 wrote:
| I'm typing these words from a brain that has absorbed
| copyrighted works.
| mmooss wrote:
| > It's not an attempt to spin the lawsuit; it's about
| reassuring their customers.
|
| It can be both. It clearly spins the lawsuit - it doesn't
| present the NYT's side at all.
| fallingknife wrote:
| Why does OpenAI have any obligation to present the NYTs
| side?
| mmooss wrote:
| Who said 'obligation'?
| roywiggins wrote:
| It would be extremely unusual (and likely very stupid) for
| the defendant in a lawsuit to post publicly that the
| plaintiff maybe has a point.
| mhitza wrote:
| My understanding is that they have to keep chats based on an
| order, *as a result of their previous accidental deletion of
| potential evidence in the case*[0].
|
| And per their own terms they likely only delete messages
| "when they want to" given the big catch-alls. "What happens
| when you delete a chat? -> It is scheduled for permanent
| deletion from OpenAI's systems within 30 days, unless: It has
| already been de-identified and disassociated from your
| account"[1]
|
| [0] https://techcrunch.com/2024/11/22/openai-accidentally-
| delete...
|
| [1] https://help.openai.com/en/articles/8809935-how-to-
| delete-an...
| conartist6 wrote:
| It's hard to reassure your customers if you can't address the
| elephant in the room. OpenAI brought this on themselves by
| flouting copyright law and assuring everyone else that such
| aggressive and probably-illegal action would be retroactively
| acceptable once they were too big to fail.
| ofjcihen wrote:
| They should include the part where the order is a result of
| them deleting things they shouldn't have then. You know, if
| this isn't spin.
|
| Then again, I'm starting to think OpenAI is gathering a
| cult-leader-like following, where any negative comment will
| result in devoted followers or those with something to gain
| immediately jumping to its defense no matter how flimsy the
| ground.
| gruez wrote:
| >They should include the part where the order is a result
| of them deleting things they shouldn't have then. You know,
| if this isn't spin.
|
| From what I can tell from the court filings, prior to the
| judge's order to retain everything, the request to retain
| everything was coming from the plaintiff, with openai
| objecting to the request and refusing to comply in the
| meantime. If so, it's a bit misleading to characterize this
| as "deleting things they shouldn't have", because what they
| "should have" done wasn't even settled. That's a bit rich
| coming from someone accusing openai of "spin".
| ofjcihen wrote:
| Here's a good article that explains what you may be
| missing.
|
| https://techcrunch.com/2024/11/22/openai-accidentally-
| delete...
| gruez wrote:
| Your linked article talks about openai deleting training
| data. I don't see how that's related to the current
| incident, which is about user queries. The ruling from
| the judge for openai to retain all user queries also
| didn't reference this incident.
| ofjcihen wrote:
| Sure.
|
| Without this devolving into a tit for tat, the
| article explains for those following this conversation
| why it's been elevated to a court order and not just an
| expectation to preserve.
| lcnPylGDnU4H9OF wrote:
| > the article explains for those following this
| conversation why it's been elevated to a court order
|
| That article does nothing of the sort and, indeed, it is
| talking about a completely separate incident of deleting
| data.
| ofjcihen wrote:
| No worries. I can't force understanding on anyone.
|
| Here. I had an LLM summarize it for you.
|
| A court order now requires OpenAI to retain all user
| data, including deleted ChatGPT chats, as part of the
| ongoing copyright lawsuit brought by The New York Times
| (NYT) and other publishers[1][2][6][7]. This order was
| issued because the NYT argued that evidence of copyright
| infringement--such as AI outputs closely matching NYT
| articles--could be lost if OpenAI continued its standard
| practice of deleting user data after 30 days[2][6][7].
|
| This new requirement is directly related to a 2024
| incident where OpenAI accidentally deleted critical data
| that NYT lawyers had gathered during the discovery
| process. In that incident, OpenAI engineers erased
| programs and search result data stored by NYT's legal
| team on dedicated virtual machines provided for examining
| OpenAI's training data[3][4][5]. Although OpenAI
| recovered some of the data, the loss of file structure
| and names rendered it largely unusable for the lawyers'
| purposes[3][5]. The court and NYT lawyers did not believe
| the deletion was intentional, but it highlighted the
| risks of relying on OpenAI's internal data retention and
| deletion practices during litigation[3][4][5].
|
| The court order to retain all user data is a direct
| response to concerns that important evidence could be
| lost--just as it was in the accidental deletion
| incident[2][6][7]. The order aims to prevent any further
| loss of potentially relevant information as the case
| proceeds. OpenAI is appealing the order, arguing it
| conflicts with user privacy and their established data
| deletion policies[1][2][6][7].
|
| Sources:
|
| [1] OpenAI Appeals Court Order Requiring Retention of
| Consumer Data
| https://www.pymnts.com/artificial-intelligence-2/2025/openai...
| [2] 'An Inappropriate Request': OpenAI Appeals ChatGPT Data
| Retention Court Order
| https://www.eweek.com/news/openai-privacy-appeal-new-york-ti...
| [3] OpenAI Deletes Legal Data in a Lawsuit From the New York
| Times
| https://www.businessinsider.com/openai-delete-legal-data-law...
| [4] NYT vs OpenAI case: OpenAI accidentally deleted case data
| https://www.medianama.com/2024/11/223-new-york-times-openai-...
| [5] New York Times Says OpenAI Erased Potential Lawsuit
| Evidence
| https://www.wired.com/story/new-york-times-openai-erased-pot...
| [6] How we're responding to The New York Times' data ... -
| OpenAI
| https://openai.com/index/response-to-nyt-data-demands/
| [7] Why OpenAI Won't Delete Your ChatGPT Chats Anymore: New
| York ...
| https://coincentral.com/why-openai-wont-delete-your-chatgpt-...
| [8] A Federal Judge Ordered OpenAI to Stop Deleting Data -
| Adweek
| https://www.adweek.com/media/a-federal-judge-ordered-openai-...
| [9] OpenAI confronts user panic over court-ordered retention
| of ChatGPT logs
| https://arstechnica.com/tech-policy/2025/06/openai-confronts...
| [10] OpenAI Appeals 'Sweeping, Unprecedented Order' Requiring
| It Maintain All ChatGPT Logs
| https://gizmodo.com/openai-appeals-sweeping-unprecedented-or...
| [11] OpenAI accidentally deleted potential evidence in NY ...
| - TechCrunch
| https://techcrunch.com/2024/11/22/openai-accidentally-delete...
| [12] OpenAI's Shocking Blunder: Key Evidence Vanishes in NY
| Times ...
| https://www.eweek.com/news/openai-deletes-potential-evidence...
| [13] Judge allows 'New York Times' copyright case against
| OpenAI to go ...
| https://www.npr.org/2025/03/26/nx-s1-5288157/new-york-times-...
| [14] OpenAI Data Retention Court Order: Implications for
| Everybody
| https://hackernoon.com/openai-data-retention-court-order-imp...
| [15] Sam Altman calls for 'AI privilege' as OpenAI clarifies
| court order to retain temporary and deleted ChatGPT sessions
| https://venturebeat.com/ai/sam-altman-calls-for-ai-privilege...
| [16] Court orders OpenAI to preserve all ChatGPT logs,
| including deleted ...
| https://techstartups.com/2025/06/06/court-orders-openai-to-p...
| [17] OpenAI deleted NYT copyright case evidence, say lawyers
| https://www.theregister.com/2024/11/21/new_york_times_lawyer...
| [18] OpenAI slams court order to save all ChatGPT logs,
| including ...
| https://simonwillison.net/2025/Jun/5/openai-court-order/
| [19] OpenAI accidentally deleted potential evidence in New
| York Times ...
| https://mashable.com/article/openai-accidentally-deleted-pot...
| [20] OpenAI slams court order to save all ChatGPT logs,
| including deleted chats
| https://news.ycombinator.com/item?id=44185913
| [21] OpenAI slams court order to save all ChatGPT logs,
| including deleted chats
| https://arstechnica.com/tech-policy/2025/06/openai-says-cour...
| [22] After court order, OpenAI is now preserving all ChatGPT
| and API logs
| https://www.reddit.com/r/LocalLLaMA/comments/1l3niws/after_c...
| [23] OpenAI accidentally erases potential evidence in
| training data lawsuit
| https://www.theverge.com/2024/11/21/24302606/openai-erases-e...
| [24] OpenAI "accidentally" erased ChatGPT training findings
| as lawyers ...
| https://www.reddit.com/r/aiwars/comments/1gwxr94/openai_acci...
| [25] OpenAI appeals data preservation order in NYT copyright
| case
| https://www.reuters.com/business/media-telecom/openai-appeal...
| lcnPylGDnU4H9OF wrote:
| You linked this article:
|
| https://techcrunch.com/2024/11/22/openai-accidentally-
| delete...
|
| Gruez said that is talking about an incident in this case
| but unrelated to the judge's order in question.
|
| You said the article "explains for those following this
| conversation why it's been elevated to a court order" but
| it doesn't actually explain that. It is talking about
| separate data being deleted in a different context. It is
| not user chats and access logs. It is the data that was
| used to train the models.
|
| I pointed that out a second time since it seemed to be
| misunderstood.
|
| Then you posted an LLM summary of something unrelated to
| the point being made.
|
| Now we're here.
|
| As you say, one cannot force understanding on another; we
| all have to do our part. ;)
|
| Edit:
|
| > The court order to retain all user data is a direct
| response to concerns that important evidence could be
| lost--just as it was in the accidental deletion
| incident[2][6][7].
|
| What did you prompt the LLM with for it to reach this
| conclusion? The [2][6][7] citations similarly don't seem
| to explain how that incident from months ago informed the
| judge's recent decision. Anyway, I'm not saying the
| conclusion is wrong, I'm saying the article you linked
| does not support the conclusion.
| ofjcihen wrote:
| I think in your rush to reply you may have not read the
| summarization.
|
| Calm down, cool off, and read it again.
|
| The point is that the circumstances of the incident in
| 2024 are directly related to the how and why of the NYT
| lawyers' request and the judge's order.
|
| The article I linked was to the incident in 2024.
|
| Not everything has to be about pedantry and snark, even
| on HN.
|
| Edit: I see you edited your response after re-reading the
| summarization. I'm glad cooler heads have prevailed.
|
| The prompt was simply "What is the relation, if any,
| between OpenAI being ordered to retain user data and the
| incident from 2024 where OpenAI accidentally deleted the
| NYT lawyers data while they were investigating whether
| OpenAI had used their data to train their models?"
| lcnPylGDnU4H9OF wrote:
| > I see you edited your response after re-reading the
| summarization.
|
| Just to be clear, the summary is not convincing. I do
| understand the idea but none of the evidence presented so
| far suggests that was the reason. The court expected that
| the data would be retained, the court learned that it was
| not, the court gave an order for it to be retained. That
| is the seeming reason for the order.
|
| Put another way: if the incident last year had not
| happened, the court would still have issued the order
| currently under discussion.
| lxgr wrote:
| If the stored data is found to be relevant to the lawsuit
| during discovery, it becomes available to at least both parties
| involved and the court, as far as I understand.
| hiddencost wrote:
| > So, by OpenAI's own admission, they are taking abundant and
| presumably effective steps to protect user privacy here? In the
| unlikely event that this data did somehow leak, I'd personally
| be blaming OpenAI, not the NYT.
|
| I am not an OpenAI stan, but this needs to be responded to.
|
| The first principle of information security is that all systems
| can be compromised and the only way to secure data is to not
| retain it.
|
| This is like saying "well, I know they didn't want to go
| skydiving, but we forced them to go skydiving and they died
| because they had a stroke mid-air; it's their fault they
| died."
|
| Anyone who makes promises about data security is at best
| incompetent and at worst dishonest.
| JohnKemeny wrote:
| > _Anyone who makes promises about data security is at best
| incompetent and at worst dishonest._
|
| Shouldn't that be "at best dishonest and at worst
| incompetent"?
|
| I mean, would you rather be a competent person telling a lie
| or an incompetent person believing you're competent?
| HPsquared wrote:
| An incompetent but honest person is more likely to accept
| correction and respond to feedback generally.
| nhecker wrote:
| Data is a toxic asset. --
| https://www.schneier.com/essays/archives/2016/03/data_is_a_t...
| pritambarhate wrote:
| Maybe because you are not an OpenAI user. I am. I find it useful
| and I pay for it. I don't want my data to be retained beyond
| what's promised in the Terms of Use and Privacy Policy.
|
| I don't think the Judge is equipped to handle this case if they
| don't understand how their order jeopardizes the privacy of
| millions of users worldwide who don't even care about NYT's
| content or bypassing their paywalls.
| mmooss wrote:
| > who don't even care about NYT's content or bypassing their
| paywalls.
|
| Whether or not you care is not relevant, and is usually the
| case for customers. If a drug company resold an expensive
| cancer drug without IP, you might say 'their order jeopardizes
| the health of millions of users worldwide who don't even care
| about Drug Co's IP.'
|
| If the NYT is right - I can only guess - then you are
| benefitting from the NYT IP. Why should you get that without
| their consent and for free - because you don't care?
|
| > jeopardizes
|
| ... is a strong word. I don't see much risk - the NYT isn't
| going to de-anonymize users and report on them, or sell the
| data (which probably would be illegal). They want to see if
| their content is being used.
| conartist6 wrote:
| You live on a pirate ship. You have no right to ignore the
| ethics and law of that just because you could be hurt in
| conflict related to piracy.
| DrillShopper wrote:
| The OpenAI Privacy Policy specifically allows them to keep
| data as required by law.
| sega_sai wrote:
| Strange smear against NYT. If NYT has a case, and the court
| approves that, it's bizarre to use the court order to smear
| NYT. If there is no case, "Open"AI will have a chance to prove
| its case in court.
| tptacek wrote:
| They're a party to the case! Saying it's baseless isn't a
| "smear". There is literally nothing else they can say (other
| than something synonymous with "baseless", like "without
| merit").
| lucianbr wrote:
| Oh they definitely _can_ say other things. It's just that it
| would be inconvenient. They might lose money.
|
| I wonder if the laws and legal procedures are written
| considering this general assumption that a party to a lawsuit
| will naturally lie if it is in their interest. And then I
| read articles and comments about a "trust based society"...
| tptacek wrote:
| I'm not taking one side or the other in the case itself,
| but it's lazy and superficial to suggest that the defendant
| in a civil suit would say anything other than that the suit
| has no merit. In the version of this statement where they
| generously interpret anything the NYT (I subscribe) says,
| they might as well just surrender.
|
| I'm not sticking up for OpenAI so much as just for decent,
| interesting threads here.
| wilg wrote:
| > They might lose money.
|
| I expect it's more about them losing the _case_. Silly to
| expect someone fighting a lawsuit not to try to win it.
| fastball wrote:
| This is the nature of the civil court system - it exists
| for when parties disagree.
|
| Why would a defendant who agrees a case has merit go to
| court at all? Much easier (and generally less expensive) to
| make the other party whole, assuming the parties agree on
| what "whole" is. And if they don't agree on what "whole"
| is, we are back to square one and of course you'd maintain
| that the other side's suit is baseless.
| mmooss wrote:
| They could say nothing about the merits of the case.
| lxgr wrote:
| The NYT is, in my view, exploiting a systemic weakness of the
| US legal system here, i.e. extremely wide-reaching discovery
| laws with almost no regard for the privacy of parties not
| involved in a given dispute, or aspects of their lives not
| relevant to the dispute at hand.
|
| Of course it's out of self-serving interests, but I find it
| hard to disagree with OpenAI on this one.
| Arainach wrote:
| What right to privacy? There is no right to have your
| interactions with a company (1) remain private, nor should
| there be. Even if there was, you agree to let OpenAI do
| essentially whatever they want with your data - including
| hand it over to the courts in response to a subpoena.
|
| (1) With limited, well-scoped exclusions for lawyers, medical
| records, etc.
| lxgr wrote:
| That may be your or your jurisdiction's view, but such
| privacy rights definitely exist in many countries.
|
| You might have heard of the GDPR, but even before that,
| several countries had "privacy by default" laws on the
| books.
| Imustaskforhelp wrote:
| But if both the parties agree, then there should be the
| freedom to stay private.
|
| Your comment is dystopian given how some people treat AI as
| their "friend". Imagine: no matter what encrypted messaging
| app or smth they use, the govt still snoops.
| fastball wrote:
| Dealer-Client privilege.
| bionhoward wrote:
| It's also a matter of competition...there are other AI
| services available today with various privacy policies
| ranging from no training by default, ability to opt out of
| training, ability to turn off data retention, or e2e
| encryption. A lot of workloads (cough, working on private
| git repos) logically require private AI to make sense.
| ChadNauseam wrote:
| Given how many important interactions people have with
| companies in our modern age, saying "There is no right to
| have your interactions with a company remain private" is
| essentially equivalent to saying "there is no right to
| privacy at all". When I talk to my friends over facetime or
| imessage, that interaction is being mediated by Apple, as
| well as by my internet service provider and (I assume) many
| other parties.
| wvenable wrote:
| > "There is no right to have your interactions with a
| company remain private" is essentially equivalent to
| saying "there is no right to privacy at all".
|
| Legally that is a correct statement.
|
| If you want that changed, it will require legislation.
| HDThoreaun wrote:
| Really not so simple. Roe v Wade was decided based on the
| implied right to privacy. Sure, it's been overturned, but if
| liberals get back on the court it will be un-overturned.
| nativeit wrote:
| That's presumably why legislation is needed?
| maketheman wrote:
| Given the current balance of the court, I'd say it's
| about even odds we end the entire century without ever
| having had a liberal court the entire time. Best
| reasonable case we're a solid couple of decades from it,
| and even that's not got _great_ odds.
|
| We'd have a better chance if anyone with power were
| talking about court reform to make the Supreme Court
| justices e.g. drawn by lot for each session from the
| district courts, but approximately nobody is. It'd be
| damn good and long overdue reform, but oh well.
|
| And the thing is, we've already had a fairly conservative
| court for _decades_. I 'm pretty likely to die, even if
| of old age, never having seen an actually-liberal court
| in the US my entire life. Like, WTF. Frankly, no wonder
| so much of our situation is fucked up, backwards, and
| authoritarianism-friendly. And (sigh) any serious
| attempts to fix that are basically on hold for many
| decades more, assuming rule of law survives that long
| anyway.
|
| [EDIT] My point, in short, is that "we still have
| [thing], we just have to wait for a liberal court that'll
| support it" is functionally indistinguishable from _not_
| having [thing].
| fallingknife wrote:
| A liberal court will probably start drawing exceptions to
| 1A out of thin air like "misinformation" and "hate
| speech." I'd rather stick with what we have.
| wvenable wrote:
| Roe v Wade refers to the constitutional right to privacy
| under the Due Process Clause of the 14th Amendment. This
| is part of individual rights against the state and has
| nothing to do with private companies. There is no general
| constitutional right that guarantees privacy in
| interactions with private companies.
| whilenot-dev wrote:
| Privacy in that example would be if no party except you
| and your friends can access the contents of this
| interaction. I wouldn't want neither Apple nor my ISP to
| have that access.
|
| A company like OpenAI that offers a SaaS is no such
| friend, and in such power dynamics (individual VS
| company) it's probably in your best interest to have
| everything public if necessary.
| lxgr wrote:
| You're always free to keep records of your ChatGPT
| conversations _on your end_.
|
| Why tangle the data of people with very different
| preferences than yours up in that?
| bobmcnamara wrote:
| > "there is no right to privacy at all"
|
| First time?
| Analemma_ wrote:
| > essentially equivalent to saying "there is no right to
| privacy at all".
|
| As others have said, in the United States this is,
| legally, completely correct: there is no right to privacy
| in American law. Lots of people think the Fourth
| Amendment is a general right to privacy, and they are
| wrong: the Fourth Amendment is specifically about
| government search and seizure, and courts have been
| largely consistent about saying it does not extend beyond
| that to e.g. relationships with private parties.
|
| If you want a right to privacy, you will need to advocate
| for laws to be changed; the ones as they exist now do not
| give it to you.
| tiahura wrote:
| No, that is incorrect. See e.g. Griswold, Lawrence, etc.
| Terr_ wrote:
| That's a fallacy of equivocation, you're introducing a
| different meaning/flavor of the same word.
|
| As it stands today, a court case (A) affirming the right
| to use contraception is not equivalent to a court case
| (B) stating that a phone-company/ISP/site may not sell
| their records of your activity.
| tiahura wrote:
| Your response hinges on a fallacy of equivocation, but
| ironically, it commits one as well.
|
| You conflate the absence of a statutory or regulatory
| regime governing private data transactions with the
| broader constitutional right to privacy. While it's true
| that the Fourth Amendment limits only state action, U.S.
| constitutional law, via cases like Griswold v.
| Connecticut and Lawrence v. Texas, and clearly recognizes
| a substantive right to privacy, grounded in the Due
| Process Clause and other constitutional penumbras. This
| is not a semantic variant; it is a distinct and
| judicially enforceable right.
|
| Moreover, beyond constitutional law, the common law
| explicitly protects privacy through torts such as
| intrusion upon seclusion, public disclosure of private
| facts, false light, and appropriation of likeness. These
| apply to private actors and are recognized in nearly
| every U.S. jurisdiction.
|
| Thus, while the Constitution may not prohibit a website
| from selling your data, it does affirm a right to privacy
| in other, fundamental contexts. To deny that entirely is
| legally incorrect.
| jcalvinowens wrote:
| In practice, the constitution says whatever the supreme
| court says it says.
|
| While these grand theories of traditional implicit
| constitutional law are nice, they're pretty meaningless
| in a system where five individuals can (and are willing
| to) vote to invalidate decades of tradition on a whim.
|
| I too want real laws.
| wvenable wrote:
| You're conflating the existence of specific privacy
| protections in narrow legal domains with a generalized,
| enforceable right to privacy which doesn't exist in US
| law. The Constitution recognizes a substantive right to
| privacy, but only in carefully defined areas like
| reproductive choice, family autonomy, and intimate
| conduct, and critically only against _state actors_.
| Citing Griswold, Lawrence, and related cases does not
| establish a sweeping privacy right enforceable against
| _private companies_.
|
| The common law torts require a high threshold of
| offensiveness and are adjudicated case by case in
| individual jurisdictions. They offer only remedies, not a
| proactive right to control your data.
|
| The original point, that there is no general right in the
| US to have your interactions with a company remain
| private, still stands. That's not a denial of all privacy
| rights but a recognition that US law fails to provide
| comprehensive privacy protection.
| tiahura wrote:
| The statement I was referring to is:
|
| "As others have said, in the United States this is,
| legally, completely correct: there is no right to privacy
| in American law."
|
| That is an incorrect statement. The common law torts I
| cited can apply in the context of a business transaction,
| so your statement is also incorrect.
|
| If your strawman is that in the US there's no right to
| privacy because there's no blanket prohibition on talking
| about other people, and what they've been up to, then run
| with it.
| wvenable wrote:
| > The common law torts I cited can apply in the context
| of a business transaction, so your statement is also
| incorrect.
|
| I completely disagree. Yes, the Prosser privacy torts
| exist: intrusion upon seclusion, public disclosure, false
| light, and appropriation. But they are highly fact-
| specific, hard to win, rarely litigated, not recognized
| in all jurisdictions, and completely reactive -- you get
| harmed first, maybe sue later!
|
| They are utterly inadequate to protect people in the
| modern data economy. A website selling your purchase
| history? Not actionable. A company logging your AI chats?
| Not intrusion. These torts are _not_ a privacy regime -
| they are scraps. Also, when we're talking about basic
| privacy rights, we're just as concerned with mundane
| material, not just "highly offensive" material that the
| torts would apply to.
| tiahura wrote:
| Because in the US we value freedom and particularly
| freedom of speech.
|
| If you don't want the grocery store telling people you buy
| Coke, don't shop there.
| wvenable wrote:
| So you've entirely given up your argument about the legal
| right to privacy involving private businesses?
| tiahura wrote:
| No, I'm saying that in many contexts it is. If, for
| example, someone hacked Safeway's store and downloaded
| your data, they'd be in trouble civilly and criminally.
| If you don't want Safeway to sell your data, deal with
| that yourself.
| wvenable wrote:
| That actually reinforces my point: there is no
| affirmative right to privacy, only reactive liability
| structures. If someone hacks Safeway, they're prosecuted
| not because you have a constitutional or general right to
| privacy, but because they violated a criminal statute
| (e.g. the Computer Fraud and Abuse Act). That's not a
| privacy right -- it's a prohibition on unauthorized
| access.
|
| As for Safeway selling your data: you're admitting that
| it's on the individual to opt out, negotiate, or avoid
| the transaction which just highlights the absence of a
| rights-based framework. The burden is entirely on the
| consumer to protect themselves, and companies can exploit
| that asymmetry unless narrowly constrained by statute
| (and even then, often with exceptions and opt-outs).
|
| What you're describing isn't a right to privacy -- it's a
| lack of one, mitigated only by scattered laws and
| personal vigilance. That is precisely the problem.
| fc417fc802 wrote:
| > There is no right to have your interactions with a
| company (1) remain private, nor should there be.
|
| Why should two entities not be able to have a confidential
| interaction if that is what they both want? Certainly a
| court order could supersede such a right just as it could
| most others provided sufficient evidence. However I would
| expect such things to be both highly justified and narrowly
| targeted.
|
| This specific case isn't so much about a right to privacy
| as it is a more general freedom to enter into contracts
| with others and expect those to be honored.
| nativeit wrote:
| Hey man, wanna buy some coke? How about trade secrets?
| State secrets?
| 1shooner wrote:
| >(1) With limited well scoped exclusions for lawyers,
| medical records, etc.
|
| Is this referring to some actual legal precedent, or just
| your personal opinion?
| levocardia wrote:
| But there's a very big difference between "no company is
| legally required to keep your data private" and "a company
| that explicitly and publicly wants to protect your
| privacy is being legally coerced into not keeping your data
| private"
| nativeit wrote:
| No room here for the company's purely self-interested
| motivations?
| davedx wrote:
| Hello. I live in the EU. Have you heard of GDPR?
| JumpCrisscross wrote:
| > _with almost no regard for the privacy of parties not
| involved to a given dispute_
|
| Third-party privacy and relevance is a constant point of
| contention in discovery. Exhibit A: this article.
| thinkingtoilet wrote:
| The privacy onus is entirely on the company. If Open AI is
| concerned about user privacy then don't collect that data.
| End of story.
| acheron wrote:
| ...the whole point of this story is that the court is
| forcing them to collect the data.
| thinkingtoilet wrote:
| You're telling me you don't think Open AI is already
| collecting chat logs?
| dghlsakjg wrote:
| Yes.
|
| In the API that is an explicit option, as it is in the
| paid consumer product. The amount of business
| that they stand to lose by maliciously flouting that part
| of their contract is in the billions.
| thinkingtoilet wrote:
| You can trust Sam Altman. I do not.
| Workaccount2 wrote:
| "I'm wrong so here is a conspiracy so I can be right
| again".
|
| Large companies lose far more by lying than they would
| gain from it.
| taormina wrote:
| No no, they are being forced to KEEP the data they
| collected. They didn't have to keep it to begin with.
| pj_mukh wrote:
| Isn't the only way to do that for ChatGPT to run
| locally on a machine? The moment your chat hits their
| server, aren't they legally required to store it?
| wyager wrote:
| Lots of people abuse the legal system in various ways. They
| don't get a free pass just because their abuse is technically
| legal itself.
| visarga wrote:
| NYT wants it both ways. When they were the ones putting
| freelancer articles into a database to rent, they argued
| against enforcing copyright and for supporting the new
| industry, claiming it was too hard to unwind their original
| assumptions. Now they absolutely love copyright.
|
| https://harvardlawreview.org/blog/2024/04/nyt-v-openai-the-t...
| moefh wrote:
| Another way of looking at it is that they lost that case over
| 20 years ago, and have been building their business model for
| 20 years accordingly.
|
| In other words, they want everyone to be forced to follow the
| same rules they were forced to follow 20 years ago.
| eviks wrote:
| And if NYT has no case, but the court approves it, is that
| still bizarre?
| tootie wrote:
| It's PR. OpenAI stole mountains of copyrighted content and are
| trying to make NYT look like bad guys. OpenAI would not be in
| the position of defending a lawsuit if they hadn't done
| something that is very likely illegal. OpenAI can also end this
| requirement right now by offering a settlement.
| lxgr wrote:
| Does anybody know if this also applies to "temporary chats" on
| ChatGPT?
|
| Given that it's not explicitly mentioned as data not being
| affected, I'm assuming it is.
| miles wrote:
| > But now, OpenAI has been forced to preserve chat history even
| when users "elect to not retain particular conversations by
| manually deleting specific conversations or by starting a
| 'Temporary Chat,' which disappears once closed," OpenAI said.
|
| https://arstechnica.com/tech-policy/2025/06/openai-says-cour...
| paxys wrote:
| > Does this court order violate GDPR or my rights under European
| or other privacy laws?
|
| > We are taking steps to comply at this time because we must
| follow the law, but The New York Times' demand does not align
| with our privacy standards. That is why we're challenging it.
|
| That's a lot of words to say "yes, we are violating GDPR".
| esafak wrote:
| Could a European court not have ordered the same thing? Is
| there an exception for lawsuits?
| lxgr wrote:
| There is, but I highly doubt a European court would have
| given such an order (or if they did, it would probably be
| axed by a higher court pretty quickly).
|
| There's decades of legal disputes in some European countries
| on whether it's even legitimate for the government to mandate
| your ISP or phone company to collect metadata on you for
| after-the-fact law enforcement searches.
|
| Looking at the actual data seems much more invasive than that
| and, in my (non-legally trained) estimate doesn't seem like
| it would stand a chance at least in higher courts.
| dragonwriter wrote:
| > There's decades of legal disputes in some European
| countries on whether it's even legitimate for the
| government to mandate your ISP or phone company to collect
| metadata on you for after-the-fact law enforcement
| searches.
|
| > Looking at the actual data seems much more invasive than
| that
|
| Looking at the data isn't involved in the current order,
| which requires OpenAI to preserve and segregate the data
| that would otherwise have been deleted. The reason for
| segregation is that any challenges OpenAI has to
| _providing that data in discovery_ will be heard before
| anyone other than OpenAI is ordered to have access to the
| data.
|
| This is, in fact, less invasive than the government
| mandating collection for speculative future uses, since it
| applies _only_ to _not destroying_ evidence _already_
| collected by OpenAI in the course of operating their
| business, and only for _potential_ use, subject to other
| challenges by OpenAI, in the present case.
| kelvinjps wrote:
| Maybe they will not store the chats of the European users?
| dragonwriter wrote:
| That's what they are trying to suggest, because they are still
| trying to use the GDPR as part of their argument challenging
| the US court order. (Kind of a longshot to get a US court to
| agree that the obligation of a US party to preserve evidence
| related to a suit in US courts under US law filed by another US
| party is mitigated by European regulations in any case, even if
| they argue that such preservation would violate obligations
| that the EU has imposed on them.)
| 3836293648 wrote:
| No, they're not, because the GDPR has an explicit exception for
| when a court orders that a company keeps data for discovery.
| It'd only be a GDPR violation if it's kept after this case is
| over.
| lompad wrote:
| This is not correct.
|
| > Any judgment of a court or tribunal and any decision of an
| administrative authority of a third country requiring a
| controller or processor to transfer or disclose personal data
| may only be recognised or enforceable in any manner if based
| on an international agreement, such as a mutual legal
| assistance treaty, in force between the requesting third
| country and the Union or a Member State, without prejudice to
| other grounds for transfer pursuant to this Chapter.
|
| So if, and only if, an agreement between the US and the EU
| allows it explicitly, it is legal. Otherwise it is not.
| atleastoptimal wrote:
| I've always assumed that anything sent to any company's hosted
| API will be logged forever. To assume otherwise always seemed
| naive, like thinking that apps aren't tracking your web activity.
| lxgr wrote:
| Assuming the worst is wise; settling for the worst-case
| outcome without any fight seems foolish.
| fragmede wrote:
| privacy nihilism is a decision all on its own
| morsch wrote:
| I'd only call it nihilism if you are in agreement with the
| grandparent and then do it anyway. Other choices are
| pretending it's not true (denialism), or just not thinking
| about it (ignorance). Or you complicate your life by not
| uploading your private info.
| Barrin92 wrote:
| Not really, it's basically just being anti-fragile. Consider
| any corporate entity that interacts with you to be an
| Eldritch horror from outer space that wants to siphon your
| soul, because that's effectively what it is, and keep your
| business with them to a minimum.
|
| It's just realism. Protect your private data yourself;
| relying on companies or governments to do it for you is, as
| the saying goes, letting a tiger devour you up to the neck
| and then asking it to stop at the head.
| mosdl wrote:
| It's funny that OpenAI is complaining; they don't mind saying
| copyright doesn't apply to them if it makes them money.
| ivape wrote:
| In retrospect, Bezos did the smartest thing by buying the
| Washington Post. In retrospect, Google did a great thing by
| working on a deal with Reddit. Content repositories/creators
| are going to sue these LLM companies in the West until they
| make licensing agreements. If I were OpenAI, I'd work hard to
| spend the money they raised to buy out as many of
| these outlets as possible.
|
| How much could the NYT back catalog be worth? Just buy it, ask
| the Saudis.
| WorldPeas wrote:
| So how is this going to impact Cursor's privacy mode, which is
| required by many companies for compliant usage of AI editors? For
| the uninitiated, in the web console this looks like:
|
| Privacy mode (enforced across all seats)
|
| OpenAI Zero-data-retention (approved)
|
| Anthropic Zero-data-retention (approved)
|
| Google Vertex AI Zero-data-retention (approved)
|
| xAi Grok Zero-data-retention (approved)
|
| did this just open another can of worms?
| qmarchi wrote:
| Likely, they're using OpenAI's Zero-Retention APIs where
| there's never data stored in the first place.
|
| So nothing?
| JumpCrisscross wrote:
| > _OpenAI 's Zero-Retention APIs_
|
| Do we know if the court order covers these?
| brigandish wrote:
| Yes, follow the link at the top.
| JumpCrisscross wrote:
| > _Yes, follow the link at the top_
|
| OpenAI says "this does not impact API customers who are
| using Zero Data Retention endpoints under our ZDR
| amendment."
| 8note wrote:
| At least, OpenAI zero-data-retention will by court order be
| full retention.
|
| I'm excited that the law is going to push for local models
| blerb795 wrote:
| The linked page specifically mentions that these ZDR APIs are
| not impacted.
|
| > This does not impact API customers who are using Zero Data
| Retention endpoints under our ZDR amendment.
| junto wrote:
| This is disingenuous from OpenAI.
|
| They are being challenged because NYT believes that ChatGPT was
| trained with copyrighted data.
|
| NYT naively pushes to find a way to prove that NYT data is
| being used in user chats, and how often.
|
| OpenAI spins that into "NYT is invading user privacy."
|
| It's quite transparent as to what they are doing here.
| dumbmrblah wrote:
| So is this for all chats going forward or does it include
| conversations retroactively?
| steve_adams_86 wrote:
| Presumably moving forward, because otherwise the data retention
| policies wouldn't have been followed correctly (from what I
| understand)
| kingkawn wrote:
| Once the data is kept, it is a matter of time until a new
| must-try use for it is born
| john2x wrote:
| Does this mean that if I can get ChatGPT to generate copyrighted
| text, they'll get in trouble?
| tiahura wrote:
| Every concerned ChatGPT user should file an emergency motion to
| intervene and request for stay of the order. ChatGPT can help you
| draft the motion and proposed order, just give it a copy of the
| discovery order. The SDNY has a very helpful pro se hotline.
|
| The order the judge issued is irresponsible. Maybe ChatGPT did
| get too cute in its discovery responses, but the remedy isn't to
| trample the rights of third parties.
| vessenes wrote:
| This is a massive overreach, not in the nature of the request
| ("please don't destroy data that might contain proof my case is
| strong") but in its scale, and that scale makes it an overreach
| by the judge. But shame on NYT for asking.
|
| This request also equals: "Please keep a backup of every
| Senator's private chats, every Senator's spouse's private chats,
| every military commander's personal chats, every politician in a
| foreign country, forever."
|
| There is no way that data will stay safe forever. There is no way
| that, once such a facility is built, it will not be used
| constantly, by governments all over the world.
|
| The NYT case seems to currently be on whether or not OpenAI users
| use ChatGPT to circumvent paywalls. Maybe they do, although when
| the suit was filed, 3.5 was definitely not a reliable witness to
| what NYT articles were about. There are 400 million MAUs at
| ChatGPT - more than the population of the US.
|
| To my mind there are three tranches of information that we could
| find out:
|
| 1. People's primary use case for ChatGPT is to get NYT articles
| for free. Therefore oAI is a bad actor making a tool that largely
| got profitable off infringing NYT's copyright.
|
| 2. Some core segment used/uses it for infringement purposes; not
| a lot, but it's a use case that sells licenses.
|
| 3. This happens, but just vanishingly rarely compared to most use
| cases of the tool.
|
| I'd imagine different rulings and orders to cure in each of these
| circumstances, but why is it that the court needs to know any
| more than some percentages?
|
| Assuming a 10k-token system prompt, 500 tokens of chat, 400M
| users, and five chats a week, that comes to roughly 67 terabytes
| of data per week(!) No metadata, just ASCII output.
|
| Nobody, ever, will read all of this. In fact, it would take about
| 24 hours for a Seagate drive to just push all the bytes down a
| bus, much less process any of it. Why not agree on representative
| searches, get a team to spot check data, and go from there?
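|
| A quick sanity check of those figures (a minimal sketch; every
| number is an assumption from the estimate above, not a measured
| value, and the ~800 MB/s sustained transfer rate is my guess for
| a fast drive):
|
|     # Back-of-the-envelope check of the ~67 TB/week estimate.
|     tokens_per_chat = 10_000 + 500     # system prompt + chat
|     bytes_per_token = 3.2              # plain ASCII, no metadata
|     users = 400_000_000                # claimed MAUs
|     chats_per_week = 5
|
|     weekly_bytes = (tokens_per_chat * bytes_per_token
|                     * users * chats_per_week)
|     print(f"{weekly_bytes / 1e12:.0f} TB per week")  # ~67 TB
|
|     # Time just to stream one week of chats off a single drive.
|     hours = weekly_bytes / 800e6 / 3600
|     print(f"{hours:.0f} hours")                      # ~23 hours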
|
| Personally, I would guess the percentage of "infringement" use
| cases, IF it is even infringement to get an AI to verbatim quote
| a news article while it is NOT infringement for Cloudflare to
| give a verbatim quote of a news article, is going to be tiny,
| tiny, tiny.
|
| NYT should back the fuck off, remember it's supposed to be a
| force for good in the world and not be the cause of massive
| possible downstream harm to people all over the world.
| fallingknife wrote:
| It's obviously 3 because the entire point of the NYT is that
| it's a newspaper and probably 99% of their traffic is from
| articles new enough that they haven't had time to go into the
| training data. So anybody who wanted to use ChatGPT to breach
| the NYT paywall couldn't get any new articles. Also there are
| so many other ways to breach a paywall that you would have to
| be insane to try to do it through prompt engineering ChatGPT.
| The whole case is a scam and I hope the court makes them pay
| OpenAI's legal fees.
| DrillShopper wrote:
| > There is no way that data will stay safe forever. There is no
| way that, once such a facility is built, it will not be used
| constantly, by governments all over the world.
|
| That's on OpenAI for deciding to retain this data in the first
| place. They could just _not_ have done that. That was a choice,
| _their choice_ , and therefore they're responsible for it.
| throwaway6e8f wrote:
| Agent-1, I want to legally retain all customer data indefinitely
| but I'm worried about a backlash from the public. Also, I'm
| having a bunch of problems with the NYT accusing us of copyright
| violation. Give me a strategy to resolve these issues so that I
| win in the long term.
| dataflow wrote:
| > ChatGPT Enterprise and ChatGPT Edu: Your workspace admins
| control how long your customer content is retained. Any deleted
| conversations are removed from our systems within 30 days, unless
| we are legally required to retain them.
|
| I'm confused, how does this not affect Enterprise or Edu? They
| clearly possess the data, so what makes them different legally?
| oxw wrote:
| Enterprise has an exemption granted by the judge
|
| > When we appeared before the Magistrate Judge on May 27, the
| Court clarified that ChatGPT Enterprise is excluded from
| preservation.
| dataflow wrote:
| Oh I missed that part, thanks. I wonder why. I guess the
| judge assumes it isn't being used for copyright infringement,
| but other plans might be?
| bee_rider wrote:
| No idea, but just to speculate--the court's goal isn't
| actually to scare OpenAI's users or harm their business,
| right? It is to collect evidence. Maybe they just figured
| they don't need to dip into that pool to get enough
| evidence.
| Grikbdl wrote:
| Who knows, it's probably the judge's twisted idea of
| "that'd be too far", as if cancelling basic privacy
| expectations of all users everywhere wouldn't be.
| landonxjames wrote:
| Repeatedly calling the lawsuit baseless feels like it makes
| OpenAI's point a lot weaker. They obviously don't like the suit, but
| I don't think you can credibly argue that there aren't tricky
| questions around the use of copyrighted materials in training
| data. Pretending otherwise is disingenuous.
| sigilis wrote:
| They pay their lawyers and whoever made this page a lot for the
| express purpose of credibly arguing that it is very clearly
| totally legal and very cool to use any IP they want to train
| their models.
|
| Could you with a straight face argue that the NYT newspaper
| could be a surrogate girlfriend for you like a GPT can be? They
| maintain that it is obviously a transformative use and
| therefore not an infringement of copyright. You and I may
| disagree with this assertion, but you can see how they could
| see this as baseless, ridiculous, and frivolous when their
| livelihoods depend on that being the case.
| Caelus9 wrote:
| Honestly, this incident makes me feel that it is really difficult
| to draw a clear line between "protecting privacy" and "obeying
| the law". On the one hand, I am very relieved that OpenAI stood
| up and said "no". After all, we all know that these systems
| collect everything by default, which makes people a bit panicked.
| But on the other hand, it sounds very strange that the court can
| directly say "give me all the data", even those that users
| explicitly delete. Moreover, this also shows that everyone
| actually cares about their information and privacy now. No one
| wants their data used for just anything.
| wand3r wrote:
| Does anyone know how this can be enforced?
|
| The ruling and situation aside, to what degree is it possible to
| enforce something like this and what are the penalties? Even in
| GDPR and other data protection cases, it seems super hard to
| enforce. Directives to keep or delete data basically require
| system level access, because the company can always CRUD their
| data whenever they want and whatever is in their best interest.
| Data can be ordered to be produced to a court periodically and
| audited, which could maybe catch an individual case, I guess. There is
| basically no way to know without literally seizing the servers in
| an extreme case. Also, the consequences in most cases are a fine.
| mmooss wrote:
| This isn't the executive branch of the US government, which has
| Constitutional powers. It's a private company and the court can
| at least impose massive penalties, presumptions against them
| at trial (causing them to lose), and contempt of court. Talk to
| a lawyer before you try something like it.
| imiric wrote:
| > the court can at least enforce massive penalties
|
| A.k.a. the cost of doing business.
| mmooss wrote:
| Businesses care deeply about money. The bravado of many
| businesspeople these days, that they are immune to
| criticism, lawsuits, etc. is a bluff. It apparently works,
| because many people repeat it.
| imiric wrote:
| When fines are a small percentage of the company's
| revenue, they do nothing to stop them from breaking the
| law. So they are in fact just the cost of doing business.
|
| E.g. Meta has been fined billions many times, yet they
| keep reoffending. It's basically become a revenue stream
| for governments.
| delusional wrote:
| I have no time for this circus.
|
| The technology anarchists in this thread need perspective. This
| is fundamentally a case about the legality of this product. In
| the extreme case, this will render the whole product category of
| "llm trained on copyrighted content" illegal. In that case, you
| will have been part of a copyright infringement on a truly
| massive scale. The users of these tools do NOT deserve privacy in
| the light of the crimes alleged.
|
| You do not get to claim to protect the privacy of the customers
| of your illegal venture.
| 6510 wrote:
| The harm this is doing and will do (regardless) seems to exceed
| the value of the NYT.
|
| If a company is subject to a US court order that violates EU law,
| the company could face legal consequences in the EU for non-
| compliance with EU law.
|
| The GDPR mandates specific consent and legal bases for processing
| data, including sharing it.
|
| Assuming it is legal to share it for legal purposes, one can't
| sufficiently anonymize the data. It needs to be accompanied by
| user data that allows requests to download it and for it to be
| deleted.
|
| I wonder what the fine would be if they just delete it per user
| agreement.
|
| I also wonder, could one, in the US, legally promise the customer
| they may delete their data, then choose to keep it indefinitely and
| share it with others?
| dvt wrote:
| > Does this court order violate GDPR or my rights under European
| or other privacy laws?
|
| > We are taking steps to comply at this time because we must
| follow the law, but The New York Times' demand does not align
| with our privacy standards. That is why we're challenging it.
|
| So basically no, lol. I wonder if we'll see the GDPR go head-to-
| head with Copyright Law here, that would be way more fun than
| OpenAI v NYT.
| yoaviram wrote:
| >Trust and privacy are at the core of our products. We give you
| tools to control your data--including easy opt-outs and permanent
| removal of deleted ChatGPT chats (opens in a new window) and API
| content from OpenAI's systems within 30 days.
|
| No you don't. You charge extra for privacy and list it as a
| feature on your enterprise plan. Not even paying Pro customers
| get "privacy". Also, you refuse to delete personal data included
| in your models and training data following numerous data
| protection requests.
| baxtr wrote:
| This is a typical "corporate speak" / "trustwashing" statement.
| It's usually super vague, filled with feel-good buzzwords, with
| a couple of empty value statements sprinkled on top.
| that_was_good wrote:
| Except all users can opt out. Am I missing something?
|
| It says here:
|
| > If you are on a ChatGPT Plus, ChatGPT Pro or ChatGPT Free
| plan on a personal workspace, data sharing is enabled for you
| by default, however, you can opt out of using the data for
| training.
|
| Enterprise is just opt out by default...
|
| https://help.openai.com/en/articles/8983130-what-if-i-want-t...
| agos wrote:
| What about all the rest of the data they use for training?
| There's no opt-out from that.
| bartvk wrote:
| Indeed. Click your profile in the top right, click on the
| settings icon. In Settings, select "Data Controls" (not
| "privacy") and then there's a setting called "Improve the
| model for everyone" (not "privacy" or "data sharing") and
| turn it off.
| bugtodiffer wrote:
| so they technically kind of follow the law but make it as
| hard as possible?
| bartvk wrote:
| Personally I feel it's okay but kinda weird. I mean, why
| not call it privacy? Gray pattern, IMHO. For example,
| venice.ai simply doesn't have a privacy setting because
| they don't use the data from chats. (They do have basic
| telemetry, and the setting is called "Disable Telemetry
| Collection").
| atoav wrote:
| Not sharing your data with other users does not mean the data
| of a deleted chat is gone; those are very likely two
| completely different mechanisms.
|
| And whether and how they use your data for their own purposes
| isn't touched by that either.
| Kiyo-Lynn wrote:
| Lately I'm not even sure if the things I say on OpenAI are really
| mine or just part of the platform. I never used to think much
| when chatting, but knowing some of it might be stored for a long
| time makes me feel uneasy. I'm not asking for much. I just want
| what I delete to actually be gone.
| nraynaud wrote:
| Isn't Altman collecting millions of eye scans? Since when did he
| care about privacy?
| CjHuber wrote:
| Even though how they responded is definitely controversial, I'm
| glad that they did publicize some response to it. After reading
| about it in the news yesterday and seeing no response on their
| side yet, I was worried that they would just keep silent
| molf wrote:
| It would help tremendously if OpenAI would make it possible to
| apply for zero data retention (ZDR). For many business needs
| there is no reason to store or log any request at all.
|
| In theory it is possible to apply (it's mentioned on multiple
| locations in the documentation), but in practice requests are
| just being ignored. I get that approval needs to be given, and
| that there are barriers to entry. But it seems to me they mention
| zero-data retention only for marketing purposes.
|
| We have applied multiple times and have yet to receive ANY
| response. Reading through the forums this seems very common.
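|
| For context, here is a minimal sketch of the closest per-request
| control available today (this assumes the documented `store` flag
| on the Chat Completions endpoint; actual ZDR is an account-level
| amendment that OpenAI must approve, which is exactly the problem
| described above):
|
|     from openai import OpenAI
|
|     client = OpenAI()  # reads OPENAI_API_KEY from the environment
|
|     resp = client.chat.completions.create(
|         model="gpt-4o",
|         messages=[{"role": "user", "content": "Hello"}],
|         # Ask OpenAI not to store this completion for its
|         # distillation/evals products. NOT the same as ZDR:
|         # abuse-monitoring logs are still kept for up to 30 days.
|         store=False,
|     )
|     print(resp.choices[0].message.content)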
| pclmulqdq wrote:
| The missing ingredient is money.
| jewelry wrote:
| Not just money. How are you going to handle this client's
| support ticket if there is no log at all?
| ethbr1 wrote:
| Don't. "We're unable to provide support for your request,
| because you disabled retention." Easy.
| hirsin wrote:
| They don't care, they still want support and most
| leadership teams are unwilling to stand behind a stance
| of telling customers no.
| abeppu wrote:
| ... but why is not responding to a request for zero
| retention today better than not being able to respond to
| a future request? They're basically already saying no to
| customers who request this capability that they said they
| support, but their refusal is in the form of never
| responding.
| belter wrote:
| If this stands I don't think they can operate in the EU
| bunderbunder wrote:
| I highly doubt this court order affects people using OpenAI
| services from the EU, as long as they're connecting to EU-
| based servers.
| glookler wrote:
| >> Does this court order violate GDPR or my rights under
| European or other privacy laws?
|
| >> We are taking steps to comply at this time because we
| must follow the law, but The New York Times' demand does
| not align with our privacy standards. That is why we're
| challenging it.
| danielfoster wrote:
| They didn't say which law (the US judge's order or EU
| law) they are complying with.
| lmm wrote:
| > In theory it is possible to apply (it's mentioned on multiple
| locations in the documentation), but in practice requests are
| just being ignored. I get that approval needs to be given, and
| that there are barriers to entry. But it seems to me they
| mention zero-data retention only for marketing purposes.
|
| What's the betting that they just write it on the website and
| never actually implemented it?
| sigmoid10 wrote:
| Tbf the approach seems pretty standard. Azure also only
| offers zero retention to vetted customers and otherwise
| retains data for up to 30 days to monitor and detect abuse.
| Since the possibilities for abuse are so high with these
| models, it would make sense that they don't simply give that
| kind of privilege to everyone - if only to cover their own
| legal position.
| ArnoVW wrote:
| My understanding is that they log 30 days by default, for
| handling of bugs. And that you can request 0 days. This is from
| their documentation
| lcnPylGDnU4H9OF wrote:
| > And that you can request 0 days.
|
| Right but the problem they're having is that the request is
| ignored.
| miles wrote:
| > I get that approval needs to be given, and that there are
| barriers to entry.
|
| Why is approval necessary, and what specific barriers (before
| the latest ruling) prevent privacy and no logging from being
| the default?
|
| OpenAI's assurances have long been met with skepticism by many,
| with the assumption that inputs are retained, analyzed, and
| potentially shared. For those concerned with genuine privacy,
| local LLMs remain essential.
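|
| As a concrete example, a minimal sketch of pointing the standard
| OpenAI client at a local Ollama server (assuming Ollama is
| installed and a model such as llama3.2 has been pulled; Ollama
| exposes an OpenAI-compatible endpoint on localhost, so no chat
| content leaves the machine):
|
|     from openai import OpenAI
|
|     # Talk to a local Ollama server instead of api.openai.com.
|     client = OpenAI(base_url="http://localhost:11434/v1",
|                     api_key="ollama")  # required but unused
|
|     resp = client.chat.completions.create(
|         model="llama3.2",  # any locally pulled model
|         messages=[{"role": "user", "content": "Hi"}],
|     )
|     print(resp.choices[0].message.content)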
| AlecSchueler wrote:
| > what specific barriers (before the latest ruling) prevent
| privacy and no logging from being the default?
|
| Product development?
| 1vuio0pswjnm7 wrote:
| "You can also request zero data retention (ZDR) for eligible
| endpoints if you have a qualifying use-case. For details on
| data handling, visit our Platform Docs page."
|
| https://openai.com/en-GB/policies/row-privacy-policy/
|
| 1. You can request it but there is no promise the request will
| be granted.
|
| Defaults matter. Silicon Valley's defaults are not designed for
| privacy. They are designed for profit. OpenAI's default is
| retention. Outputs are saved by default.
|
| It is difficult to take the arguments in their memo in support of
| their objection to the preservation order seriously. OpenAI already
| preserves outputs by default.
| mediumsmart wrote:
| It's a newspaper. Copies are sold for a price, not to one person,
| and they don't come with an NDA. They become part of history and
| society.
| conartist6 wrote:
| Hey OpenAI! In your "why is this happening" you left some bits
| out.
|
| You make it sound like they're mad at you for no reason at all.
| How unreasonable of them when confronted with such honorable
| folks as yourselves!
| energy123 wrote:
| > Consumer customers: You control whether your chats are used to
| help improve ChatGPT within settings, and this order doesn't
| change that either.
|
| Within "settings"? Is this referring to the dark pattern of
| providing users with a toggle "Improve model for everyone" that
| doesn't actually do anything? Instead users must submit a request
| manually on a hard to discover off-app portal, but this dark
| pattern has deceived them into think they don't need to look for
| it.
| sib301 wrote:
| Can you please elaborate?
| energy123 wrote:
| To opt-out of your data being trained on, you need to go to
| https://privacy.openai.com and click the button "Make a
| Privacy Request".
| alextheparrot wrote:
| in the app: Settings ~> Data Controls ~> Improve the model
| for everyone
| curtisblaine wrote:
| Yes, could you please explain why toggling "Improve model for
| everyone" off doesn't do anything and provide a link to this
| off-portal app that you mention?
| jamesgill wrote:
| Follow the money.
| udev4096 wrote:
| The irony is palpable here
| hombre_fatal wrote:
| You know how it's always been a meme that you'd be mortally
| embarrassed if your browser history ever leaked?
|
| Imagine how much worse it is for your LLM chat history to leak.
|
| It's even worse than your private comms with humans because it's
| a raw look at how you are when you think you're alone, untempered
| by social expectations.
| vitaflo wrote:
| WTF are you asking LLMs and why would you expect any of it to
| be private?
| ofjcihen wrote:
| "Write a song in the style of Slipknot about my dumb inbred
| dogs. I love them very much but they are...reaaaaally dumb."
|
| To be fair the song was intense.
| hombre_fatal wrote:
| It's not that the convos are necessarily icky.
|
| It's that it's like watching how someone might treat a slave
| when they think they're alone. And how you might talk down to
| or up to something that looks like another person. And how
| pathetic you might act when it's not doing what you want. And
| what level of questions you outsource to an LLM. And what
| things you refuse to do yourself. And how petty the tasks
| might be, like workshopping a stupid twitter comment before
| you post it. And how you copied that long text from your
| distraught girlfriend and asked it for some response ideas.
| etc. etc. etc.
|
| At the very least, I'd wager that it reveals that bit of true
| helpless patheticness inherent in all of us that we try so
| hard to hide.
|
| Show me your LLM chat history and I will learn a lot about
| your personality. Nothing else compares.
| Jackpillar wrote:
| Might have to re-emphasize his question, but: what
| questions are you asking your LLM? Why are you responding
| to it and/or "treating" it differently than you would a
| calculator or search engine?
| hombre_fatal wrote:
| Because it's far more capable than a calculator or search
| engine and because you interact with it with
| conversational text, it reveals more aspects about your
| personality.
|
| Why might your search engine queries reveal more about
| you than your keystrokes in a calculator? Now dial that
| up.
| Jackpillar wrote:
| Sure - but I don't interact with it as if it's human, so my
| demeanor or attitude is neutral because I'm talking to
| you know - a computer. Are you getting emotional with and
| reprimanding your chatbot?
| hombre_fatal wrote:
| I don't get why I'm receiving pushback here. How you
| treat the LLM was only a fraction of my examples for ways
| you can look pathetic if your chats were made public.
|
| You don't reprimand the google search box, yet your
| search history might still be embarrassing.
| hackinthebochs wrote:
| Your points were very accurate and relevant. Some people
| have a serious lack of imagination. The perpetual
| naysayers will never have their minds changed.
| hombre_fatal wrote:
| Good god, thank you. I thought I was making an obvious,
| unanimous point when I wrote that first comment.
| AlecSchueler wrote:
| It's so tiring to read. You're making a reasonable point.
| Some people can't believe that other people behave or
| feel differently to themselves.
| alec_irl wrote:
| > how you copied that long text from your distraught
| girlfriend and asked it for some response ideas
|
| good lord, if tech were ethical then there would be
| mandatory reporting when someone consults an LLM to tell
| them how they should be responding to their intimate
| partner. are your skills of expression already that hobbled
| by chat bots?
| hombre_fatal wrote:
| These are just concrete examples to get the imagination
| going, not an exhaustive list of the ways that you are
| revealing your true self in the folds of your LLM chat
| history.
|
| Note that it doesn't have to go all the way to "he gets
| Claude to help him win text arguments with his gf" for an
| uncomfortable amount of your self to be revealed by the
| chats.
|
| There is always something icky about someone observing
| messages you wrote in privacy, and you don't have to have
| particularly unsavory messages for it to be icky. Why is
| that?
| alec_irl wrote:
| i don't personally see messages with an LLM as being
| different from, say, terminal commands. it's a machine
| interface. it sounds like you're anthropomorphizing the
| chat bot, if you're talking to it like you would a human
| then i would be more worried about the implications that
| has for you as a person.
| hombre_fatal wrote:
| Focusing on how you anthropomorphize the LLM isn't really
| interacting with the point since it was one example.
|
| Might someone's google search history be embarrassing
| even though they don't treat google like a human?
| AlecSchueler wrote:
| What does this comment add to the conversation? It feels
| like a personal attack with no real rebuttal. People
| who anthropomorphise them all talk to them; the human-
| like interface is the entire selling point.
| lcnPylGDnU4H9OF wrote:
| > are your skills of expression already that hobbled by
| chat bots?
|
| You have it backwards. My skills of expression were
| hobbled by my upbringing, and others' thoughts on self-
| expression allowed my skills to flourish. I _wish_ I had
| a chat bot to help me understand interpersonal
| communication because I could have actually had good
| examples growing up.
| threecheese wrote:
| This product is positioned as a personal copilot, and future
| iterations (based on leaked plans, may or may not be true) as
| a wholly integrated life assistant.
|
| Why would a customer expect this not to be private? How can
| one even know how it could be used against them, when they
| don't even know what's being collected or gleaned from collected
| data?
|
| I am following these issues closely, as I am terrified that
| my "assistant" will some day prevent me from obtaining
| employment, insurance, medical care etc. And I'm just a non
| law breaking normie.
|
| A current day example would be TX state authorities using
| third party social/ad data to identify potentially pregnant
| women along with ALPR data purchased from a third party to
| identify any who attempt to have an out of state abortion, so
| they can be prosecuted. Whatever you think about that law, it
| is terrifying that a shift in it could find arbitrary digital
| signals being used against you in this way.
___________________________________________________________________
(page generated 2025-06-06 23:01 UTC)