[HN Gopher] How we're responding to The NYT's data demands in or...
___________________________________________________________________
How we're responding to The NYT's data demands in order to protect
user privacy
Author : BUFU
Score : 264 points
Date : 2025-06-06 00:35 UTC (22 hours ago)
(HTM) web link (openai.com)
(TXT) w3m dump (openai.com)
| supriyo-biswas wrote:
| I wonder whether OpenAI legal can make the case for storing fuzzy
| hashes of the content, in the form of ssdeep[1] hashes or
| content-defined chunks[2] of said data, instead of the actual
| conversations themselves.
|
| After all, since the NYT has a very limited corpus of
| information, and supposedly people are generating infringing
| content using their APIs, said hashes can be used to check
| whether such content has been generated.
|
| I'd rather have them store nothing, but given the overly broad
| court order I think this may be the best middle ground. Of
| course, I haven't read the lawsuit documents and don't know if
| NYT is requesting far more, or alleging some indirect form of
| infringement which would invalidate my proposal.
|
| [1] https://ssdeep-project.github.io/ssdeep/index.html
|
| [2] https://joshleeb.com/posts/content-defined-chunking.html
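|
| For illustration, a minimal sketch of that comparison using
| the python-ssdeep bindings (the texts and setup below are
| hypothetical stand-ins, not anything OpenAI or the NYT
| actually does):
|
|     import ssdeep  # pip install ssdeep (python-ssdeep bindings)
|
|     # Stand-ins for a copyrighted article and a logged model
|     # output; only the short hash strings would be retained.
|     reference_text = "All the news that's fit to print. " * 200
|     model_output = "All the news that's fit to print. " * 190
|
|     # hash() produces a compact fuzzy-hash string.
|     ref_hash = ssdeep.hash(reference_text)
|     out_hash = ssdeep.hash(model_output)
|
|     # compare() returns a 0-100 similarity score; a high score
|     # flags near-verbatim reproduction without storing the
|     # conversation itself.
|     print(ssdeep.compare(ref_hash, out_hash))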
| paxys wrote:
| Yeah, try explaining any of these words to a lawyer or judge.
| m463 wrote:
| "you are a helpful law assistant."
| fc417fc802 wrote:
| I thought that's what GPT was for.
| landl0rd wrote:
| "You are a long-suffering clerk speaking to a judge who's sat
| the same federal bench for two decades and who believes
| 'everything is computer' constitutes a deep technical
| insight."
| sthatipamala wrote:
| The judges in these technical cases can be quite
| sophisticated and absolutely do learn terms of art. See
| Oracle v. Google (Java API case)
| anshumankmr wrote:
| Looking up the judge for that one
| (https://en.wikipedia.org/wiki/William_Alsup), who was a
| hobbyist BASIC programmer, one would need a judge who coded
| MNIST as a pastime hobby if that is the case.
| king_magic wrote:
| a smart judge who is minimally tech savvy could learn to
| train a model to predict MNIST in a day or two
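|
| For a sense of scale, a minimal sketch of that exercise in
| scikit-learn, using its bundled 8x8 digits set as a stand-in
| for full MNIST (the library choice and rough accuracy are my
| assumption, not from the thread):
|
|     from sklearn.datasets import load_digits
|     from sklearn.linear_model import LogisticRegression
|     from sklearn.model_selection import train_test_split
|
|     # load_digits() ships with scikit-learn, so nothing to
|     # download; each sample is an 8x8 grayscale digit image.
|     X, y = load_digits(return_X_y=True)
|     X_train, X_test, y_train, y_test = train_test_split(
|         X, y, test_size=0.25, random_state=0)
|
|     # Even a plain linear classifier scores roughly 95% here.
|     clf = LogisticRegression(max_iter=1000)
|     clf.fit(X_train, y_train)
|     print(clf.score(X_test, y_test))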
| bigyabai wrote:
| All of that _does_ fit on a real spiffy whitepaper. Let's not
| fool around though: every ChatGPT session is sent directly into
| an S3 bucket that some three-letter spook backs up onto their
| tapes every month. It's a database of candid, timestamped text
| interactions from a bunch of rubes that logged in with their
| Google account - you couldn't ask for a juicier target unless
| you reinvented email. _Of course_ it's backdoored, you can't
| even begin to try proving me wrong.
|
| Maybe I'm alone, but a pinkie-promise from Sam Altman does not
| confer any assurances about my data to me. It's about as
| reassuring as a singing telegram from Mark Zuckerberg dancing
| to a song about how secure WhatsApp is.
| landl0rd wrote:
| Of course I can't even begin trying to prove you wrong.
| You're making an unfalsifiable statement. You're pointing to
| the Russell's Teapot of sigint.
|
| It's well-established that the American IC, primarily NSA,
| collects a lot of metadata about internet traffic. There are
| some justifications for this and it's less bad in the age of
| ubiquitous TLS, but it generally sucks. However, legal
| protections against directly spying on the actual decrypted
| content of Americans are at least in theory stronger.
|
| Snowden's leaks mentioned the NSA tapping inter-DC links of
| Google and Yahoo, so if they had to tap links, I doubt
| there's a ton of voluntary cooperation.
|
| I'd also point out that trying to parse the unabridged
| prodigious output of the SlopGenerator9000 is a really hard
| task unless you also use LLMs to do it.
| cwillu wrote:
| > I'd also point out that trying to parse the unabridged
| prodigious output of the SlopGenerator9000 is a really hard
| task unless you also use LLMs to do it.
|
| The input is what's interesting.
| Aeolun wrote:
| It doesn't change the monumental scope of the problem
| though.
|
| Though I'm inclined to believe the US gov can if OpenAI
| can.
| tdeck wrote:
| > Snowden's leaks mentioned the NSA tapping inter-DC links
| of Google and Yahoo, so if they had to tap links, I doubt
| there's a ton of voluntary cooperation.
|
| The laws have changed since then and it's not for the
| better:
|
| https://www.aclu.org/press-releases/congress-passing-bill-
| th...
| tuckerman wrote:
| Even if the laws give them this power, I believe it would
| be extremely difficult for an operation like this to go
| unnoticed (and therefore unreported) at most of these
| companies. MUSCULAR [1] could be pulled off because of the
| cleartext inter-datacenter traffic, which was subsequently
| encrypted. It's hard to see how they could pull off a
| similar operation without the cooperation of Google, which
| would also entail a tremendous internal cover-up.
|
| [1] https://en.wikipedia.org/wiki/MUSCULAR
| onli wrote:
| Warrantlessly installed backdoors in the log system
| combined with a gag order, combined with secret courts,
| all "perfectly legal". Not really hard to imagine.
| tuckerman wrote:
| You would have to gag a huge chunk of the engineers and I
| just don't think that would work without leaks. Google's
| infrastructure would not make something like that easy to
| do clandestinely (trying to avoid saying impossible but
| it gets close).
|
| I was an SRE and SWE on technical infra at Google,
| specifically the logging infrastructure. I am under no
| gag order.
| komali2 wrote:
| There's no way to know, but it's safer to assume.
| zer00eyz wrote:
| > However, legal protections against directly spying on the
| actual decrypted content of Americans are at least in
| theory stronger.
|
| This was the point of a lot of the Five Eyes programs.
| It's not legal for the US to spy on its own citizens, but it
| isn't against the law for us to do it to the Australians...
| who are all too happy to reciprocate.
|
| > Snowden's leaks mentioned the NSA tapping inter-DC links
| of Google and Yahoo...
|
| Snowden's info wasn't really news for many of us who were
| paying attention in the aftermath of 9/11:
| https://en.wikipedia.org/wiki/Room_641A (This was huge on
| slashdot at the time... )
| dmurray wrote:
| > You're pointing to the Russell's Teapot of sigint.
|
| If there were multiple agencies with billion dollar budgets
| and a belief that they had an absolute national security
| mandate to get a teapot into solar orbit, and to lie about
| it, I would believe there was enough porcelain up there to
| make a second asteroid belt.
| rl3 wrote:
| > _However, legal protections against directly spying on
| the actual decrypted content of Americans are at least in
| theory stronger._
|
| Yeah, because the definition of collection was redefined to
| mean accessing the full content already stored on their
| systems, post-interception. It wasn't considered collected
| until an analyst viewed it. Metadata was a laughable dog and
| pony show that was part of the same legal shell games at
| the time, over a decade ago now.
|
| That said, from an outsider's perspective it sounded like
| the IC did collectively erect robust guard rails such that
| access to information was generally controlled and audited.
| I felt like this broke down a bit once sharing 702 data
| with other federal agencies was expanded around the same
| time period.
|
| These days, those guard rails might be the only thing
| standing in the way of democracy as we know it ending in
| the US. AI processing applied to full-take collection is
| terrifying, just ask the Chinese.
| Yizahi wrote:
| Metadata is spying (c) Bruce Schneier
|
| If a CIA spook is stalking you everywhere, documenting your
| every visible move or interaction, you probably would call
| that spying. Same applies to digital.
|
| Also, the teapot argument can be applied in reverse. We
| have all these documented open digital network systems
| everywhere, and you want to say that one of the most
| unprofitable and certainly most expensive-to-run systems
| is somehow protecting all user data? That belief is based
| on what? At least selling data is based on evidence from
| the industry and on the actual ToSes of other similar
| corpos.
| jstanley wrote:
| The comment you replied to isn't saying that metadata
| isn't spying. It's saying that the spies generally don't
| have free access to content data.
| Workaccount2 wrote:
| My choice conspiracy is that the three letter agencies
| actively support their omnipresent, omniknowing
| conspiracies because it ultimately plays into their hand.
| Sorta like a Santa Claus for citizens.
| bigyabai wrote:
| > because it ultimately plays into their hand.
|
| How? Scared criminals aren't going to make themselves
| easy to find. Three-letter spooks would almost certainly
| prefer to smoke-test a docile population than a paranoid
| one.
|
| In fact, it kinda overwhelmingly seems like _the
| opposite_ happens. Remember the 2015 San Bernardino
| shooting that was pushed into the national news for no
| reason? Remember how the FBI bloviated about how _hard_
| it was to get information from an iPhone, 3 years after
| Tim Cook's assent to the PRISM program?
|
| Stuff like this is almost certainly theater. If OpenAI
| perceived retention as a life-or-death issue, they would
| be screaming about this case from the top of their lungs.
| If the FBI perceived it as a life-or-death issue, we
| would never hear about it in our lifetimes. The dramatic
| and protracted public fights suggest to me that OpenAI
| simply wants an alibi. Some sort of user-story that
| _smells_ like secure and private technology, but in
| actuality is very obviously neither.
| farts_mckensy wrote:
| Think of all the complete garbage interactions you'd have to
| sift through to find anything useful from a national security
| standpoint. The data is practically obfuscated by virtue of
| its banality.
| artursapek wrote:
| I've done my part cluttering it with my requests for the
| same banana bread recipe like 5 separate times.
| refuser wrote:
| It was that good?
| baobun wrote:
| gief
| brigandish wrote:
| Search engines have been doing this since the mid-90s and
| have only improved. To think that any data is _obfuscated_
| by being part of some huge volume of other data is a
| fallacy at best.
| farts_mckensy wrote:
| Search engines use our data for completely different
| purposes.
| yunwal wrote:
| That doesn't negate the GPs point. It's easy to make
| datasets searchable.
| farts_mckensy wrote:
| Searchable? You have to know what to search for, and you
| have to rule out false positives. How do you discern a
| person roleplaying some secret agent scenario vs. a
| person actually plotting something? That's not something
| a search function can distinguish. It requires a human to
| sift through that data.
| bigyabai wrote:
| "We kill people based on metadata." - National Security
| Agency Gen. Michael Hayden
|
| Raw data with time-series significance is their absolute
| favorite. You might argue something like Google Maps data
| is "obfuscated by virtue of its banality" until you catch
| the right person in the wrong place. ChatGPT sessions are
| the same way, and they're going to be fed into aggregate
| surveillance systems the way modern telecom and advertiser
| data is.
| farts_mckensy wrote:
| This is mostly security theater, and generally not worth
| the lift when you consider the steps needed to unlock the
| value of that data in the context of investigations.
| bigyabai wrote:
| Citation?
| farts_mckensy wrote:
| -The Privacy and Civil Liberties Oversight Board's 2014
| review of the NSA "Section 215" phone-record program
| found no instance in which the dragnet produced a
| counter-terror lead that couldn't have been obtained with
| targeted subpoenas.
| https://en.m.wikipedia.org/wiki/Privacy_and_Civil_Liberties_...
|
| -After Boston, Paris, Manchester, and other attacks,
| post-mortems showed the perpetrators were already in
| government databases. Analysts simply didn't connect the
| dots amid the flood of benign hits.
| https://www.newyorker.com/magazine/2015/01/26/whole-haystack
|
| -Independent tallies suggest dozens of civilians killed
| for every intended high-value target in Yemen and
| Pakistan, largely because metadata mis-identifies phones
| that change pockets.
| https://committees.parliament.uk/writtenevidence/36962/pdf
| 7speter wrote:
| Maybe I'm wrong, and maybe this was discussed previously, but
| of course OpenAI keeps our data; they use it for training!
| nl wrote:
| As the linked page points out you can turn this off in
| settings if you are an end user or choose zero retention if
| you are an API user.
| justacrow wrote:
| I mean, they already stole and used all the copyrighted
| material they could find to train the thing. Am I
| supposed to believe that they won't use my data just
| because I tick a checkbox?
| stock_toaster wrote:
| Agreed, I have a hard time believing anything the
| eye-scanning crypto coin (Worldcoin or whatever) guy says
| at this point.
| Jackpillar wrote:
| I wish I could test drive your brain to experience a
| world where one believes that would stop them from
| stealing your data.
| rl3 wrote:
| _> Of course it's backdoored, you can't even begin to try
| proving me wrong._
|
| On the contrary.
|
| _> Maybe I'm alone, but a pinkie-promise from Sam Altman
| does not confer any assurances about my data to me._
|
| I think you're being unduly paranoid. /s
|
| https://www.theverge.com/2024/6/13/24178079/openai-board-
| pau...
|
| https://www.wsj.com/tech/ai/the-real-story-behind-sam-
| altman...
| LandoCalrissian wrote:
| Trying to actively circumvent the intention of a judge's order
| is a pretty bad idea.
| girvo wrote:
| Deeply, deeply so. In fact, so much so that people who
| suggest such things show they've (luckily) not had to
| interact with the legal system much. Judges take an
| incredibly dim view of that kind of thing haha
| Aeolun wrote:
| That's not circumvention though. The intent of the order is
| to be able to prove that ChatGPT regurgitates NYT content,
| not to read the personal communications of all ChatGPT users.
| delusional wrote:
| I haven't been able to find any of the supporting documents,
| but the court order makes it seem like OpenAI has been
| unhelpful in producing any alternative during the conversation.
|
| For example, the judge seems to have asked if it would be
| possible to segregate data that the users wanted deleted from
| other data, but OpenAI has failed to answer. Not just denied
| the request, but simply ignored it.
|
| I think it's quite likely that OpenAI has taken the PR route
| instead of seriously engaging with any way to constructively
| honor the request for retention of data.
| vanattab wrote:
| Protect our privacy? Or protect their right to piracy?
| NBJack wrote:
| Agreed. I don't buy the spin.
| charrondev wrote:
| I mean, the court is ordering them to retain user conversations
| at least until resolution of the court case (in case there are
| copyrighted responses being generated?).
|
| So user privacy is definitely implicated.
| amluto wrote:
| It appears that the "Zero Data Retention" APIs they mention are
| something that customers need to request access to, and that it's
| really quite hard to get this access. I'd be more impressed if
| any API user could use those APIs.
| singron wrote:
| If OpenAI cared about our privacy, ZDR would be a setting
| anyone could turn on.
| JimDabell wrote:
| I believe Apple's agreement includes this, at least when a user
| isn't signed into an OpenAI account:
|
| > OpenAI must process your request solely for the purpose of
| fulfilling it and not store your request or any responses it
| provides unless required under applicable laws. OpenAI also
| must not use your request to improve or train its models.
|
| -- https://www.apple.com/legal/privacy/data/en/chatgpt-
| extensio...
|
| I wonder if we'll end up seeing Apple dragged into this
| lawsuit. I'm sure after telling their users it's private, they
| won't be happy about everything getting logged, even if they do
| have that caveat in there about complying with laws.
| fc417fc802 wrote:
| > I'm sure after telling their users it's private, they won't
| be happy about everything getting logged,
|
| The ZDR APIs are not and will not be logged. The linked page
| is clear about that.
| FireBeyond wrote:
| Sure, OpenAI, I will absolutely trust you.
|
| > The content covered by the court order is stored separately in
| a secure system. It's protected under legal hold, meaning it
| can't be accessed or used for purposes other than meeting legal
| obligations.
|
| That's horse shit and OpenAI knows it. It means no such thing. A
| legal hold is just a 'preservation order'. It says _absolutely
| nothing_ about other access or use.
| fragmede wrote:
| why is it horse shit that OpenAI is saying they've put the
| files in a cabinet that only legal has access to?
| FireBeyond wrote:
| They are saying a "legal hold" means that they have to keep
| the data but don't worry they're not allowed to use it or
| access it for any other reason.
|
| A legal hold requires no such thing and there would be no
| such requirement in it. They are perfectly free to access and
| use it for any reason.
| mmooss wrote:
| OpenAI's other policies, and other laws and regulations, do
| have such requirements. Are they nullified because the data is
| held under a court order?
| mrguyorama wrote:
| "The judge and court need to view this information to
| actually pass justice and decide the case" almost always
| supersedes other laws.
|
| The GDPR does not say that you can never be proven to have
| done something wrong in a court of law.
| mmooss wrote:
| Right. The GGP says the information could be used for other
| purposes.
| tomhow wrote:
| Related discussion:
|
| _OpenAI slams court order to save all ChatGPT logs, including
| deleted chats_ - https://news.ycombinator.com/item?id=44185913 -
| June 2025 (878 comments)
| dangus wrote:
| I think the court order doesn't quite go against as many norms as
| OpenAI is claiming. It's very reasonable to retain data pertinent
| to a case, and NYT's case almost certainly revolves around
| finding out copyright infringement damages, which are calculated
| based on the number of violations (how many users queried ChatGPT
| and were returned verbatim copyrighted material from NYT).
|
| If you don't retain that data you're destroying evidence for the
| case.
|
| It's not like the data is going to be given to anyone, it's only
| going to be used for limited legal purposes for the lawsuit (as
| OpenAI confirms in this article).
|
| And honestly, OpenAI should have just not used copyrighted data
| illegally and they would have never had this problem. I saw NYT's
| filing and it had very compelling evidence that you could get
| ChatGPT to distribute verbatim copyrighted text from the Times
| without citation.
| tptacek wrote:
| _And honestly, OpenAI should have just not used copyrighted
| data illegally and they would have never had this problem_
|
| The whole premise of the lawsuit is that they didn't do
| anything unlawful, so saying "just do what the NYT wanted you
| to do" isn't interesting.
| dangus wrote:
| No, you're misinterpreting how information discovery and the
| court system works.
|
| The NYT made an argument to a judge about what they think is
| going on and how they think the copyright infringement is
| taking place and harming them. In their filings and hearings
| they present the reasoning and evidence they have that leads
| them to believe that a violation is occurring. The court
| makes a judgment on whether or not to order OpenAI to
| preserve and disclose information relevant to the case to the
| court.
|
| It's not "just do what NYT wanted you to do," it's "do what
| the court orders you to do based on a lawsuit brought by a
| plaintiff and argued to the court."
|
| I suggest you read the court filing: https://nytco-
| assets.nytimes.com/2023/12/NYT_Complaint_Dec20...
| lxgr wrote:
| It absolutely goes against norms in many countries other than
| the US, and the data of residents/citizens of these countries
| are affected too.
|
| > It's not like the data is going to be given to anyone, it's
| only going to be used for limited legal purposes for the lawsuit
| (as OpenAI confirms in this article).
|
| Nobody other than both parties to the case, their lawyers, the
| court, and whatever case file storage system they use. In my
| view, that's already way too much given the amount and value of
| this data.
| dangus wrote:
| Countries other than the US aren't part of this lawsuit.
| ChatGPT operates in the US under US law. I don't know if they
| have separated data storage for other countries.
|
| I don't believe you would be considered to be violating the
| GDPR if you are complying with another court order, because
| you are presumably making a best effort to comply with the
| GDPR besides that court order.
|
| You're saying it's unreasonable to store data somewhere for a
| pending court case? Conceptually you're saying that you can't
| preserve data for trials because the filing cabinets might
| see the information. That's ridiculous, if that was true then
| it would be impossible to perform discovery and get anything
| done in court.
| lxgr wrote:
| > I don't believe you would be considered to be violating
| the GDPR if you are complying with another court order,
| because you are presumably making a best effort to comply
| with the GDPR besides that court order.
|
| It most likely depends on the exact circumstances. I could
| absolutely imagine a European court deciding that, sorry,
| but if you have to answer to a court decision incompatible
| with European privacy laws, you can't offer services to
| European residents anymore.
|
| > You're saying it's unreasonable to store data somewhere
| for a pending court case?
|
| I'm saying it can be, depending on how much personal and/or
| unrelated data gets tangled up in it. That seems to be the
| case here.
|
| > Conceptually you're saying that you can't preserve data
| for trials because the filing cabinets might see the
| information.
|
| I'm only saying that there should be proportionality. A
| court having access to all facts relevant to a case is
| important, but it's not the only important thing in the
| world.
|
| Otherwise, we could easily end up with a Dirk-Gently-esque
| court that, based on the principle that everything is
| connected to everything, will just demand access to all the
| data in the world.
| dangus wrote:
| The scope of the data access required by the court is
| being worked out via due process. That's why there's an
| appeal system. OpenAI is just grandstanding in a public
| forum so that their customers don't defect.
|
| When it comes to the GDPR, courts have generally taken the
| stance that it does not override discovery obligations.
|
| Ironburg Inventions, Ltd. v. Valve Corp.
|
| Finjan, Inc. v. Zscaler, Inc.
|
| Corel Software, LLC v. Microsoft
|
| Rollins Ranches, LLC v. Watson
|
| In none of these cases was a GDPR fine issued.
| danenania wrote:
| Putting the merits of this specific case and positive vs.
| negative sentiments toward OpenAI aside, this tactic seems like
| it can be used to destroy any business or organization with
| customers who place a high value on privacy--without actually
| going through due process and winning a lawsuit.
|
| Imagine a lawsuit against Signal that claimed some nefarious
| activity, harmful to the plaintiff, was occurring broadly in
| chats. The plaintiff can claim, like NYT, that it might be
| necessary to examine private chats in the future to make a
| determination about some aspect of the lawsuit, and the judge
| can then order Signal to find a way to retain all chats for
| potential review.
|
| However you feel about OpenAI, this is not a good precedent for
| user privacy and security.
| fc417fc802 wrote:
| That's not entirely fair. The argument isn't "users are using
| the service to break the law" but rather "the service is
| facilitating law breaking". To fix your Signal analogy,
| suppose you could use the chat interface to request
| copyrighted material from the operator.
| charcircuit wrote:
| That doesn't change the outcome being the same: the app
| has to hand over everyone's plaintext messages, including
| the chat history of every user.
| fc417fc802 wrote:
| Right. But requiring logs due to suspicion that the
| service itself is actively violating the law is entirely
| different from doing so on the basis that end users might
| be up to no good _entirely independently_.
|
| Also OpenAI was never E2EE to begin with. They were
| already retaining logs for some period of time.
|
| My personal view is that the court order is overly broad
| and disregards potential impacts on end users but it's
| nonetheless important to be accurate about what is and
| isn't happening here.
| dangus wrote:
| Again, keep in mind that we are talking about a
| case-limited analysis of that data within the privacy of the
| court system.
|
| For example, if the trial happens to find data that some
| chats include crimes committed by users in their private
| chats, the court can't just send police to your door
| based on that information since the information is only
| being used in the context of an intellectual property
| lawsuit.
|
| Remember that privacy rights are legitimate rights but
| they change a lot when you're in the context of an
| investigation/court proceeding. E.g., the right of police
| to enter and search your home changes a lot when they get
| a court-issued warrant.
|
| The whole point of E2EE services from the perspective of
| privacy-conscious customers is that a court can get a
| warrant for data from those companies but they'll only be
| able to produce encrypted blobs with no access to
| decryption keys. OpenAI was always a not-E2EE service, so
| customers have to expect that a court order could surface
| their data to someone else's eyes at some point.
| dangus wrote:
| I'm confused at how you think that NYT isn't going through
| due process and attempting to win a lawsuit.
|
| The court isn't saying "preserve this data forever and ever
| and compromise everyone's privacy," they're saying "preserve
| this data for the purposes of this court while we perform an
| investigation."
|
| IMO, the NYT has a very good argument here that the only way
| to determine the scope of the copyright infringement is to
| analyze requests and responses made by every single customer.
| Like I said in my original comment, the remedies for
| copyright infringement are on a per-infringement basis. E.g.,
| every time someone on LimeWire downloads Song 2 by Blur from
| your PC, you've committed one instance of copyright
| infringement. My interpretation is that NYT wants the court
| to find out how many times customers have received ChatGPT
| responses that include verbatim New York Times content.
| _jab wrote:
| > How will you store my data and who can access it?
|
| > The content covered by the court order is stored separately in
| a secure system. It's protected under legal hold, meaning it
| can't be accessed or used for purposes other than meeting legal
| obligations.
|
| > Only a small, audited OpenAI legal and security team would be
| able to access this data as necessary to comply with our legal
| obligations.
|
| So, by OpenAI's own admission, they are taking abundant and
| presumably effective steps to protect user privacy here? In the
| unlikely event that this data did somehow leak, I'd personally be
| blaming OpenAI, not the NYT.
|
| Some of the other language in this post, like repeatedly calling
| the lawsuit "baseless", really makes this just read like an
| unconvincing attempt at a spin piece. Nothing to see here.
| sashank_1509 wrote:
| Obviously OpenAI's point of view will be their point of view.
| They are going to call this lawsuit baseless; otherwise they
| would not be fighting it.
| ivape wrote:
| To me it's pretty clear the way this will happen. You will
| need to buy additional credits or subscriptions through these
| LLMs that feed payments back to things like NYT and book
| publishers. It's all stolen. I don't even want to hear it.
| This company doesn't want to pay up and is willing to let
| users' privacy hang in the balance to draw the case out until
| they get sure footing with their device launches or the like
| (or additional markets like enterprise, etc).
| fallingknife wrote:
| Copyright is pretty narrowly tailored to verbatim
| reproduction of content so I doubt they will have to pay
| anything.
| tiahura wrote:
| Incorrect. Copyright applies to derivative works.
| vel0city wrote:
| Even then, it's possible to prompt the model to exactly
| reproduce the copyrighted works.
| fallingknife wrote:
| Please show me one of these prompts
| vel0city wrote:
| NYT has examples in their legal complaint. See page 30.
|
| https://www.scribd.com/document/695189742/NYT-v-OpenAI
| Workaccount2 wrote:
| > It's all stolen.
|
| LLMs _are not_ massive archives of data. The big models are
| a few TB in size. No one is forgoing a NYT subscription
| because they can ask ChatGPT to print out NYT news stories.
| edbaskerville wrote:
| Regardless of the representation, some people _are_
| replacing news consumption generally with answers from
| ChatGPT.
| tptacek wrote:
| No, there is a whole news cycle about how chats you delete
| aren't actually being deleted because of a lawsuit; they
| essentially have to respond. It's not an attempt to spin the
| lawsuit; it's about reassuring their customers.
| VanTheBrand wrote:
| The part where they go out of the way to call the lawsuit
| baseless is spin though, and mixing that with this messaging
| presents a mixed message. The NYT lawsuit is objectively not
| baseless. OpenAI did train on the Times and ChatGPT does
| output information from that training. That's the basis of
| the lawsuit. NYT may lose, this could end up being considered
| fair use, it might ultimately be a flimsy basis for a
| lawsuit, but to say it's baseless (and with nothing to back
| that up) is spin and makes this message less reassuring.
| tptacek wrote:
| No, it's not. It's absolutely standard corporate
| communications. If they're fighting the lawsuit, that is
| essentially the only thing they can say about it. Ford
| Motor Company would say the same thing (well, they'd
| probably say "meritless and frivolous").
| bee_rider wrote:
| Standard corporate spin, then?
| tptacek wrote:
| No? "Spin" implies there was something else they could
| possibly say.
| mmooss wrote:
| I haven't heard that interpretation; I might call it spin
| of spin.
| justacrow wrote:
| They could choose to not say it
| ethbr1 wrote:
| Indeed. Taken to its conclusion, this thread suggests
| that corporations are justified in saying whatever they
| want in order to further their own ends.
|
| Including lies.
|
| I'd like to aim a little higher, maybe towards expecting
| correspondence with reality?
|
| IOW, yes, there is no law that OpenAI can't try to spin
| this. But it's still a shitty, non-factually-based choice
| to make.
| mrgoldenbrown wrote:
| If you're being held at gunpoint and forced to lie, your
| words are still a lie. Whether you were forced or not is
| a separate dimension.
| bee_rider wrote:
| That is unrelated to what the expression means.
| bunderbunder wrote:
| No, this isn't even close to spin, it's just a standard
| part of defending your case. In the US tort system you
| need to be constantly publicly saying you did nothing
| wrong. Any wavering on that point could be used against
| you in court.
| jmull wrote:
| This is a funny thread. You say "No" but then restate the
| point with slightly different words. As if anything a
| company says publicly about ongoing litigation isn't
| spin.
| bunderbunder wrote:
| I suppose it's down to how you define "spin". Personally
| I'm in favor of a definition of the term that doesn't
| excessively dilute it.
| bee_rider wrote:
| Can you share your definition? This is actually quite
| puzzling because as far as I know "spin" has always been
| associated with presenting things in a way that benefits
| you. Like, decades ago, they could have the show "Bill
| O'Reilly's No Spin Zone" and everybody knew the premise
| was that they argue against guests who were trying to
| tell a "massaged" version of the story, and that they'd
| go for some actual truth (fwiw I thought the whole show
| was full of crap, but the name was not confusing or
| ambiguous).
|
| I'm not aware of any definition of "spin" where being
| conventional is a defense against that accusation.
| Actually, that was the (imagined) value-add of the show,
| that conventional corporate and political messaging is
| heavily spun.
| adamsb6 wrote:
| I'm typing these words from a brain that has absorbed
| copyrighted works.
| mmooss wrote:
| > It's not an attempt to spin the lawsuit; it's about
| reassuring their customers.
|
| It can be both. It clearly spins the lawsuit - it doesn't
| present the NYT's side at all.
| fallingknife wrote:
| Why does OpenAI have any obligation to present the NYTs
| side?
| mmooss wrote:
| Who said 'obligation'?
| roywiggins wrote:
| It would be extremely unusual (and likely very stupid) for
| the defendant in a lawsuit to post publicly that the
| plaintiff maybe has a point.
| mhitza wrote:
| My understanding is that they have to keep chats based on an
| order, *as a result of their previous accidental deletion of
| potential evidence in the case*[0].
|
| And per their own terms they likely only delete messages
| "when they want to" given the big catch-alls. "What happens
| when you delete a chat? -> It is scheduled for permanent
| deletion from OpenAI's systems within 30 days, unless: It has
| already been de-identified and disassociated from your
| account"[1]
|
| [0] https://techcrunch.com/2024/11/22/openai-accidentally-
| delete...
|
| [1] https://help.openai.com/en/articles/8809935-how-to-
| delete-an...
| conartist6 wrote:
| It's hard to reassure your customers if you can't address the
| elephant in the room. OpenAI brought this on themselves by
| flouting copyright law and assuring everyone else that such
| aggressive and probably-illegal action would be retroactively
| acceptable once they were too big to fail.
| ofjcihen wrote:
| They should include the part where the order is a result of
| them deleting things they shouldn't have then. You know, if
| this isn't spin.
|
| Then again, I'm starting to think OpenAI is gathering a
| cult-leader-like following, where any negative comment will
| result in devoted followers or those with something to gain
| immediately jumping to its defense no matter how flimsy the
| ground.
| gruez wrote:
| >They should include the part where the order is a result
| of them deleting things they shouldn't have then. You know,
| if this isn't spin.
|
| From what I can tell from the court filings, prior to the
| judge's order to retain everything, the request to retain
| everything was coming from the plaintiff, with openai
| objecting to the request and refusing to comply in the
| meantime. If so, it's a bit misleading to characterize this
| as "deleting things they shouldn't have", because what they
| "should have" done wasn't even settled. That's a bit rich
| coming from someone accusing openai of "spin".
| ofjcihen wrote:
| Here's a good article that explains what you may be
| missing.
|
| https://techcrunch.com/2024/11/22/openai-accidentally-
| delete...
| gruez wrote:
| Your linked article talks about openai deleting training
| data. I don't see how that's related to the current
| incident, which is about user queries. The ruling from
| the judge for openai to retain all user queries also
| didn't reference this incident.
| ofjcihen wrote:
| Sure.
|
| Without this devolving into a tit for tat, the
| article explains for those following this conversation
| why it's been elevated to a court order and not just an
| expectation to preserve.
| lcnPylGDnU4H9OF wrote:
| > the article explains for those following this
| conversation why it's been elevated to a court order
|
| That article does nothing of the sort and, indeed, it is
| talking about a completely separate incident of deleting
| data.
| ofjcihen wrote:
| No worries. I can't force understanding on anyone.
|
| Here. I had an LLM summarize it for you.
|
| A court order now requires OpenAI to retain all user
| data, including deleted ChatGPT chats, as part of the
| ongoing copyright lawsuit brought by The New York Times
| (NYT) and other publishers[1][2][6][7]. This order was
| issued because the NYT argued that evidence of copyright
| infringement--such as AI outputs closely matching NYT
| articles--could be lost if OpenAI continued its standard
| practice of deleting user data after 30 days[2][6][7].
|
| This new requirement is directly related to a 2024
| incident where OpenAI accidentally deleted critical data
| that NYT lawyers had gathered during the discovery
| process. In that incident, OpenAI engineers erased
| programs and search result data stored by NYT's legal
| team on dedicated virtual machines provided for examining
| OpenAI's training data[3][4][5]. Although OpenAI
| recovered some of the data, the loss of file structure
| and names rendered it largely unusable for the lawyers'
| purposes[3][5]. The court and NYT lawyers did not believe
| the deletion was intentional, but it highlighted the
| risks of relying on OpenAI's internal data retention and
| deletion practices during litigation[3][4][5].
|
| The court order to retain all user data is a direct
| response to concerns that important evidence could be
| lost--just as it was in the accidental deletion
| incident[2][6][7]. The order aims to prevent any further
| loss of potentially relevant information as the case
| proceeds. OpenAI is appealing the order, arguing it
| conflicts with user privacy and their established data
| deletion policies[1][2][6][7].
|
| Sources:
|
| [1] OpenAI Appeals Court Order Requiring Retention of
| Consumer Data
| https://www.pymnts.com/artificial-intelligence-2/2025/openai...
| [2] 'An Inappropriate Request': OpenAI Appeals ChatGPT Data
| Retention Court Order
| https://www.eweek.com/news/openai-privacy-appeal-new-york-ti...
| [3] OpenAI Deletes Legal Data in a Lawsuit From the New York
| Times
| https://www.businessinsider.com/openai-delete-legal-data-law...
| [4] NYT vs OpenAI case: OpenAI accidentally deleted case data
| https://www.medianama.com/2024/11/223-new-york-times-openai-...
| [5] New York Times Says OpenAI Erased Potential Lawsuit
| Evidence
| https://www.wired.com/story/new-york-times-openai-erased-pot...
| [6] How we're responding to The New York Times' data ... -
| OpenAI
| https://openai.com/index/response-to-nyt-data-demands/
| [7] Why OpenAI Won't Delete Your ChatGPT Chats Anymore: New
| York ...
| https://coincentral.com/why-openai-wont-delete-your-chatgpt-...
| [8] A Federal Judge Ordered OpenAI to Stop Deleting Data -
| Adweek
| https://www.adweek.com/media/a-federal-judge-ordered-openai-...
| [9] OpenAI confronts user panic over court-ordered retention
| of ChatGPT logs
| https://arstechnica.com/tech-policy/2025/06/openai-confronts...
| [10] OpenAI Appeals 'Sweeping, Unprecedented Order' Requiring
| It Maintain All ChatGPT Logs
| https://gizmodo.com/openai-appeals-sweeping-unprecedented-or...
| [11] OpenAI accidentally deleted potential evidence in NY ...
| - TechCrunch
| https://techcrunch.com/2024/11/22/openai-accidentally-delete...
| [12] OpenAI's Shocking Blunder: Key Evidence Vanishes in NY
| Times ...
| https://www.eweek.com/news/openai-deletes-potential-evidence...
| [13] Judge allows 'New York Times' copyright case against
| OpenAI to go ...
| https://www.npr.org/2025/03/26/nx-s1-5288157/new-york-times-...
| [14] OpenAI Data Retention Court Order: Implications for
| Everybody
| https://hackernoon.com/openai-data-retention-court-order-imp...
| [15] Sam Altman calls for 'AI privilege' as OpenAI clarifies
| court order to retain temporary and deleted ChatGPT sessions
| https://venturebeat.com/ai/sam-altman-calls-for-ai-privilege...
| [16] Court orders OpenAI to preserve all ChatGPT logs,
| including deleted ...
| https://techstartups.com/2025/06/06/court-orders-openai-to-p...
| [17] OpenAI deleted NYT copyright case evidence, say lawyers
| https://www.theregister.com/2024/11/21/new_york_times_lawyer...
| [18] OpenAI slams court order to save all ChatGPT logs,
| including ...
| https://simonwillison.net/2025/Jun/5/openai-court-order/
| [19] OpenAI accidentally deleted potential evidence in New
| York Times ...
| https://mashable.com/article/openai-accidentally-deleted-pot...
| [20] OpenAI slams court order to save all ChatGPT logs,
| including deleted chats
| https://news.ycombinator.com/item?id=44185913
| [21] OpenAI slams court order to save all ChatGPT logs,
| including deleted chats
| https://arstechnica.com/tech-policy/2025/06/openai-says-cour...
| [22] After court order, OpenAI is now preserving all ChatGPT
| and API logs
| https://www.reddit.com/r/LocalLLaMA/comments/1l3niws/after_c...
| [23] OpenAI accidentally erases potential evidence in
| training data lawsuit
| https://www.theverge.com/2024/11/21/24302606/openai-erases-e...
| [24] OpenAI "accidentally" erased ChatGPT training findings
| as lawyers ...
| https://www.reddit.com/r/aiwars/comments/1gwxr94/openai_acci...
| [25] OpenAI appeals data preservation order in NYT copyright
| case
| https://www.reuters.com/business/media-telecom/openai-appeal...
| lcnPylGDnU4H9OF wrote:
| You linked this article:
|
| https://techcrunch.com/2024/11/22/openai-accidentally-
| delete...
|
| Gruez said that is talking about an incident in this case
| but unrelated to the judge's order in question.
|
| You said the article "explains for those following this
| conversation why it's been elevated to a court order" but
| it doesn't actually explain that. It is talking about
| separate data being deleted in a different context. It is
| not user chats and access logs. It is the data that was
| used to train the models.
|
| I pointed that out a second time since it seemed to be
| misunderstood.
|
| Then you posted an LLM summary of something unrelated to
| the point being made.
|
| Now we're here.
|
| As you say, one cannot force understanding on another; we
| all have to do our part. ;)
|
| Edit:
|
| > The court order to retain all user data is a direct
| response to concerns that important evidence could be
| lost--just as it was in the accidental deletion
| incident[2][6][7].
|
| What did you prompt the LLM with for it to reach this
| conclusion? The [2][6][7] citations similarly don't seem
| to explain how that incident from months ago informed the
| judge's recent decision. Anyway, I'm not saying the
| conclusion is wrong, I'm saying the article you linked
| does not support the conclusion.
| ofjcihen wrote:
| I think in your rush to reply you may have not read the
| summarization.
|
| Calm down, cool off, and read it again.
|
| The point is that the circumstances of the incident in
| 2024 are directly related to the how and why of the NYT
| lawyers' request and the judge's order.
|
| The article I linked was to the incident in 2024.
|
| Not everything has to be about pedantry and snark, even
| on HN.
|
| Edit: I see you edited your response after re-reading the
| summarization. I'm glad cooler heads have prevailed.
|
| The prompt was simply "What is the relation, if any,
| between OpenAI being ordered to retain user data and the
| incident from 2024 where OpenAI accidentally deleted the
| NYT lawyers data while they were investigating whether
| OpenAI had used their data to train their models?"
| lcnPylGDnU4H9OF wrote:
| > I see you edited your response after re-reading the
| summarization.
|
| Just to be clear, the summary is not convincing. I do
| understand the idea but none of the evidence presented so
| far suggests that was the reason. The court expected that
| the data would be retained, the court learned that it was
| not, the court gave an order for it to be retained. That
| is the seeming reason for the order.
|
| Put another way: if the incident last year had not
| happened, the court would still have issued the order
| currently under discussion.
| lxgr wrote:
| If the stored data is found to be relevant to the lawsuit
| during discovery, it becomes available to at least both parties
| involved and the court, as far as I understand.
| hiddencost wrote:
| > So, by OpenAI's own admission, they are taking abundant and
| presumably effective steps to protect user privacy here? In the
| unlikely event that this data did somehow leak, I'd personally
| be blaming OpenAI, not the NYT.
|
| I am not an OpenAI stan, but this needs to be responded to.
|
| The first principle of information security is that all systems
| can be compromised and the only way to secure data is to not
| retain it.
|
| This is like saying "well, I know they didn't want to go
| skydiving, but we forced them to go skydiving and they died
| because they had a stroke mid-air; it's their fault they
| died."
|
| Anyone who makes promises about data security is at best
| incompetent and at worst dishonest.
| JohnKemeny wrote:
| > _Anyone who makes promises about data security is at best
| incompetent and at worst dishonest._
|
| Shouldn't that be "at best dishonest and at worst
| incompetent"?
|
| I mean, would you rather be a competent person telling a lie
| or an incompetent person believing you're competent?
| HPsquared wrote:
| An incompetent but honest person is more likely to accept
| correction and respond to feedback generally.
| nhecker wrote:
| Data is a toxic asset. --
| https://www.schneier.com/essays/archives/2016/03/data_is_a_t...
| pritambarhate wrote:
| Maybe because you are not an OpenAI user. I am. I find it useful
| and I pay for it. I don't want my data to be retained beyond
| what's promised in the Terms of Use and Privacy Policy.
|
| I don't think the Judge is equipped to handle this case if they
| don't understand how their order jeopardizes the privacy of
| millions of users worldwide who don't even care about NYT's
| content or bypassing their paywalls.
| mmooss wrote:
| > who don't even care about NYT's content or bypassing their
| paywalls.
|
| Whether or not you care is not relevant, and is usually the
| case for customers. If a drug company resold an expensive
| cancer drug without IP, you might say 'their order jeopardizes
| the health of millions of users worldwide who don't even care
| about Drug Co's IP.'
|
| If the NYT is right - I can only guess - then you are
| benefitting from the NYT IP. Why should you get that without
| their consent and for free - because you don't care?
|
| > jeopardizes
|
| ... is a strong word. I don't see much risk - the NYT isn't
| going to de-anonymize users and report on them, or sell the
| data (which probably would be illegal). They want to see if
| their content is being used.
| conartist6 wrote:
| You live on a pirate ship. You have no right to ignore the
| ethics and law of that just because you could be hurt in
| conflict related to piracy.
| DrillShopper wrote:
| The OpenAI Privacy Policy specifically allows them to keep
| data as required by law.
| sega_sai wrote:
| Strange smear against NYT. If NYT has a case, and the court
| approves that, it's bizarre to use the court order to smear
| NYT. If there is no case, "Open"AI will have a chance to prove
| its case in court.
| tptacek wrote:
| They're a party to the case! Saying it's baseless isn't a
| "smear". There is literally nothing else they can say (other
| than something synonymous with "baseless", like "without
| merit").
| lucianbr wrote:
| Oh they definitely _can_ say other things. It's just that it
| would be inconvenient. They might lose money.
|
| I wonder if the laws and legal procedures are written
| considering this general assumption that a party to a lawsuit
| will naturally lie if it is in their interest. And then I
| read articles and comments about a "trust based society"...
| tptacek wrote:
| I'm not taking one side or the other in the case itself,
| but it's lazy and superficial to suggest that the defendant
| in a civil suit would say anything other than that the suit
| has no merit. In the version of this statement where they
| generously interpret anything the NYT (I subscribe) says,
| they might as well just surrender.
|
| I'm not sticking up for OpenAI so much as just for decent,
| interesting threads here.
| wilg wrote:
| > They might lose money.
|
| I expect it's more about them losing the _case_. Silly to
| expect someone fighting a lawsuit not to try to win it.
| fastball wrote:
| This is the nature of the civil court system - it exists
| for when parties disagree.
|
| Why would a defendant who agrees a case has merit go to
| court at all? Much easier (and generally less expensive) to
| make the other party whole, assuming the parties agree on
| what "whole" is. And if they don't agree on what "whole"
| is, we are back to square one and of course you'd maintain
| that the other side's suit is baseless.
| mmooss wrote:
| They could say nothing about the merits of the case.
| lxgr wrote:
| The NYT is, in my view, exploiting a systemic weakness of the
| US legal system here, i.e. extremely wide-reaching discovery
| laws with almost no regard for the privacy of parties not
| involved in a given dispute, or aspects of their lives not
| relevant to the dispute at hand.
|
| Of course it's out of self-serving interests, but I find it
| hard to disagree with OpenAI on this one.
| Arainach wrote:
| What right to privacy? There is no right to have your
| interactions with a company (1) remain private, nor should
| there be. Even if there was, you agree to let OpenAI do
| essentially whatever they want with your data - including
| hand it over to the courts in response to a subpoena.
|
| (1) With limited, well-scoped exclusions for lawyers, medical
| records, etc.
| lxgr wrote:
| That may be your or your jurisdiction's view, but such
| privacy rights definitely exist in many countries.
|
| You might have heard of the GDPR, but even before that,
| several countries had "privacy by default" laws on the
| books.
| Imustaskforhelp wrote:
| But if both the parties agree, then there should be the
| freedom to stay private.
|
| Your comment is dystopian given how some people treat AI as
| their "friend". Imagine: no matter what encrypted messaging
| app or smth they use, the govt still snoops.
| fastball wrote:
| Dealer-Client privilege.
| bionhoward wrote:
| It's also a matter of competition...there are other AI
| services available today with various privacy policies
| ranging from no training by default, ability to opt out of
| training, ability to turn off data retention, or e2e
| encryption. A lot of workloads (cough, working on private
| git repos) logically require private AI to make sense.
| ChadNauseam wrote:
| Given how many important interactions people have with
| companies in our modern age, saying "There is no right to
| have your interactions with a company remain private" is
| essentially equivalent to saying "there is no right to
| privacy at all". When I talk to my friends over facetime or
| imessage, that interaction is being mediated by Apple, as
| well as by my internet service provider and (I assume) many
| other parties.
| wvenable wrote:
| > "There is no right to have your interactions with a
| company remain private" is essentially equivalent to
| saying "there is no right to privacy at all".
|
| Legally that is a correct statement.
|
| If you want that changed, it will require legislation.
| HDThoreaun wrote:
| Really not so simple. Roe v Wade was decided based on the
| implied right to privacy. Sure, it's been overturned, but if
| liberals get back on the court it will be un-overturned.
| nativeit wrote:
| That's presumably why legislation is needed?
| maketheman wrote:
| Given the current balance of the court, I'd say it's
| about even odds we end the entire century without ever
| having had a liberal court the entire time. Best
| reasonable case we're a solid couple of decades from it,
| and even that's not got _great_ odds.
|
| We'd have a better chance if anyone with power were
| talking about court reform to make the Supreme Court
| justices e.g. drawn by lot for each session from the
| district courts, but approximately nobody is. It'd be
| damn good and long overdue reform, but oh well.
|
| And the thing is, we've already had a fairly conservative
| court for _decades_. I 'm pretty likely to die, even if
| of old age, never having seen an actually-liberal court
| in the US my entire life. Like, WTF. Frankly, no wonder
| so much of our situation is fucked up, backwards, and
| authoritarianism-friendly. And (sigh) any serious
| attempts to fix that are basically on hold for many
| decades more, assuming rule of law survives that long
| anyway.
|
| [EDIT] My point, in short, is that "we still have
| [thing], we just have to wait for a liberal court that'll
| support it" is functionally indistinguishable from _not_
| having [thing].
| fallingknife wrote:
| A liberal court will probably start drawing exceptions to
| 1A out of thin air like "misinformation" and "hate
| speech." I'd rather stick with what we have.
| wvenable wrote:
| Roe v Wade refers to the constitutional right to privacy
| under the Due Process Clause of the 14th Amendment. This
| is part of individual rights against the state and has
| nothing to do with private companies. There is no general
| constitutional right that guarantees privacy in
| interactions with private companies.
| whilenot-dev wrote:
| Privacy in that example would be if no party except you
| and your friends can access the contents of this
| interaction. I wouldn't want neither Apple nor my ISP to
| have that access.
|
| A company like OpenAI that offers a SaaS is no such
| friend, and in such power dynamics (individual VS
| company) it's probably in your best interest to have
| everything public if necessary.
| lxgr wrote:
| You're always free to keep records of your ChatGPT
| conversations _on your end_.
|
| Why tangle the data of people with very different
| preferences than yours up in that?
| bobmcnamara wrote:
| > "there is no right to privacy at all"
|
| First time?
| Analemma_ wrote:
| > essentially equivalent to saying "there is no right to
| privacy at all".
|
| As others have said, in the United States this is,
| legally, completely correct: there is no right to privacy
| in American law. Lots of people think the Fourth
| Amendment is a general right to privacy, and they are
| wrong: the Fourth Amendment is specifically about
| government search and seizure, and courts have been
| largely consistent about saying it does not extend beyond
| that to e.g. relationships with private parties.
|
| If you want a right to privacy, you will need to advocate
| for laws to be changed; the ones as they exist now do not
| give it to you.
| tiahura wrote:
| No, that is incorrect. See e.g. Griswold, Lawrence, etc.
| Terr_ wrote:
| That's a fallacy of equivocation, you're introducing a
| different meaning/flavor of the same word.
|
| As it stands today, a court case (A) affirming the right
| to use contraception is not equivalent to a court case
| (B) stating that a phone-company/ISP/site may not sell
| their records of your activity.
| tiahura wrote:
| Your response hinges on a fallacy of equivocation, but
| ironically, it commits one as well.
|
| You conflate the absence of a statutory or regulatory
| regime governing private data transactions with the
| broader constitutional right to privacy. While it's true
| that the Fourth Amendment limits only state action, U.S.
| constitutional law, via cases like Griswold v.
| Connecticut and Lawrence v. Texas, and clearly recognizes
| a substantive right to privacy, grounded in the Due
| Process Clause and other constitutional penumbras. This
| is not a semantic variant; it is a distinct and
| judicially enforceable right.
|
| Moreover, beyond constitutional law, the common law
| explicitly protects privacy through torts such as
| intrusion upon seclusion, public disclosure of private
| facts, false light, and appropriation of likeness. These
| apply to private actors and are recognized in nearly
| every U.S. jurisdiction.
|
| Thus, while the Constitution may not prohibit a website
| from selling your data, it does affirm a right to privacy
| in other, fundamental contexts. To deny that entirely is
| legally incorrect.
| jcalvinowens wrote:
| In practice, the constitution says whatever the supreme
| court says it says.
|
| While these grand theories of traditional implicit
| constitutional law are nice, they're pretty meaningless
| in a system where five individuals can (and are willing
| to) vote to invalidate decades of tradition on a whim.
|
| I too want real laws.
| wvenable wrote:
| You're conflating the existence of specific privacy
| protections in narrow legal domains with a generalized,
| enforceable right to privacy which doesn't exist in US
| law. The Constitution recognizes a substantive right to
| privacy, but only in carefully defined areas like
| reproductive choice, family autonomy, and intimate
| conduct, and critically only against _state actors_.
| Citing Griswold, Lawrence, and related cases does not
| establish a sweeping privacy right enforceable against
| _private companies_.
|
| The common law torts require a high threshold of
| offensiveness and are adjudicated case by case in
| individual jurisdictions. They offer only remedies, not a
| proactive right to control your data.
|
| The original point, that there is no general right in the
| US to have your interactions with a company remain
| private, still stands. That's not a denial of all privacy
| rights but a recognition that US law fails to provide
| comprehensive privacy protection.
| tiahura wrote:
| The statement I was referring to is:
|
| "As others have said, in the United States this is,
| legally, completely correct: there is no right to privacy
| in American law."
|
| That is an incorrect statement. The common law torts I
| cited can apply in the context of a business transaction,
| so your statement is also incorrect.
|
| If your strawman is that in the US there's no right to
| privacy because there's no blanket prohibition on talking
| about other people, and what they've been up to, then run
| with it.
| wvenable wrote:
| > The common law torts I cited can apply in the context
| of a business transaction, so your statement is also
| incorrect.
|
| I completely disagree. Yes, the Prosser privacy torts
| exist: intrusion upon seclusion, public disclosure, false
| light, and appropriation. But they are highly fact-
| specific, hard to win, rarely litigated, not recognized
| in all jurisdictions, and completely reactive -- you get
| harmed first, maybe sue later!
|
| They are utterly inadequate to protect people in the
| modern data economy. A website selling your purchase
| history? Not actionable. A company logging your AI chats?
| Not intrusion. These torts are _not_ a privacy regime -
| they are scraps. Also, when we're talking about basic
| privacy rights, we're just as concerned with mundane
| material, not just "highly offensive" material that the
| torts would apply to.
| tiahura wrote:
| Because in the US we value freedom and particularly
| freedom of speech.
|
| If you don't want the grocery store telling people you buy
| Coke, don't shop there.
| wvenable wrote:
| So you've entirely given up your argument about the legal
| right to privacy involving private businesses?
| tiahura wrote:
| No, I'm saying that in many contexts it is. If, for
| example, someone hacked Safeway's store and downloaded
| your data, they'd be in trouble civilly and criminally.
| If you don't want Safeway to sell your data, deal with
| that yourself.
| wvenable wrote:
| That actually reinforces my point: there is no
| affirmative right to privacy, only reactive liability
| structures. If someone hacks Safeway, they're prosecuted
| not because you have a constitutional or general right to
| privacy, but because they violated a criminal statute
| (e.g. the Computer Fraud and Abuse Act). That's not a
| privacy right -- it's a prohibition on unauthorized
| access.
|
| As for Safeway selling your data: you're admitting that
| it's on the individual to opt out, negotiate, or avoid
| the transaction which just highlights the absence of a
| rights-based framework. The burden is entirely on the
| consumer to protect themselves, and companies can exploit
| that asymmetry unless narrowly constrained by statute
| (and even then, often with exceptions and opt-outs).
|
| What you're describing isn't a right to privacy -- it's a
| lack of one, mitigated only by scattered laws and
| personal vigilance. That is precisely the problem.
| fc417fc802 wrote:
| > There is no right to have your interactions with a
| company (1) remain private, nor should there be.
|
| Why should two entities not be able to have a confidential
| interaction if that is what they both want? Certainly a
| court order could supersede such a right just as it could
| most others provided sufficient evidence. However I would
| expect such things to be both highly justified and narrowly
| targeted.
|
| This specific case isn't so much about a right to privacy
| as it is a more general freedom to enter into contracts
| with others and expect those to be honored.
| nativeit wrote:
| Hey man, wanna buy some coke? How about trade secrets?
| State secrets?
| 1shooner wrote:
| >(1) With limited well scoped exclusions for lawyers,
| medical records, etc.
|
| Is this referring to some actual legal precedent, or just
| your personal opinion?
| levocardia wrote:
| But there's a very big difference between "no company is
| legally required to keep your data private" and "a company
| that explicitly and publicly wants to protect your
| privacy is being legally coerced into not keeping your data
| private"
| nativeit wrote:
| No room here for the company's purely self-interested
| motivations?
| davedx wrote:
| Hello. I live in the EU. Have you heard of GDPR?
| JumpCrisscross wrote:
| > _with almost no regard for the privacy of parties not
| involved to a given dispute_
|
| Third-party privacy and relevance is a constant point of
| contention in discovery. Exhibit A: this article.
| thinkingtoilet wrote:
| The privacy onus is entirely on the company. If Open AI is
| concerned about user privacy then don't collect that data.
| End of story.
| acheron wrote:
| ...the whole point of this story is that the court is
| forcing them to collect the data.
| thinkingtoilet wrote:
| You're telling me you don't think Open AI is already
| collecting chat logs?
| dghlsakjg wrote:
| Yes.
|
| In the API that is an explicit option, as it is in the
| paid consumer product. The amount of business
| that they stand to lose by maliciously flouting that part
| of their contract is in the billions.
| thinkingtoilet wrote:
| You can trust Sam Altman. I do not.
| Workaccount2 wrote:
| "I'm wrong so here is a conspiracy so I can be right
| again".
|
| Large companies lose far more by lying than they would
| gain from it.
| taormina wrote:
| No no, they are being forced to KEEP the data they
| collected. They didn't have to keep it to begin with.
| pj_mukh wrote:
| Isn't the only way to do that for ChatGPT to run
| locally on a machine? The moment your chat hits their
| server, aren't they legally required to store it?
| wyager wrote:
| Lots of people abuse the legal system in various ways. They
| don't get a free pass just because their abuse is technically
| legal itself.
| visarga wrote:
| NYT wants it both ways. When they were the ones putting
| freelancer articles into a database to rent, they argued
| against enforcing copyright and for supporting the new
| industry, claiming it was too hard to unwind their original
| assumptions. Now they absolutely love copyright.
|
| https://harvardlawreview.org/blog/2024/04/nyt-v-openai-the-t...
| moefh wrote:
| Another way of looking at it is that they lost that case over
| 20 years ago, and have been building their business model for
| 20 years accordingly.
|
| In other words, they want everyone to be forced to follow the
| same rules they were forced to follow 20 years ago.
| eviks wrote:
| And if NYT has no case, but the court approves it, is that
| still bizarre?
| tootie wrote:
| It's PR. OpenAI stole mountains of copyrighted content and are
| trying to make NYT look like bad guys. OpenAI would not be in
| the position of defending a lawsuit if they hadn't done
| something that is very likely illegal. OpenAI can also end this
| requirement right now by offering a settlement.
| lxgr wrote:
| Does anybody know if this also applies to "temporary chats" on
| ChatGPT?
|
| Given that it's not explicitly mentioned as data not being
| affected, I'm assuming it is.
| miles wrote:
| > But now, OpenAI has been forced to preserve chat history even
| when users "elect to not retain particular conversations by
| manually deleting specific conversations or by starting a
| 'Temporary Chat,' which disappears once closed," OpenAI said.
|
| https://arstechnica.com/tech-policy/2025/06/openai-says-cour...
| paxys wrote:
| > Does this court order violate GDPR or my rights under European
| or other privacy laws?
|
| > We are taking steps to comply at this time because we must
| follow the law, but The New York Times' demand does not align
| with our privacy standards. That is why we're challenging it.
|
| That's a lot of words to say "yes, we are violating GDPR".
| esafak wrote:
| Could a European court not have ordered the same thing? Is
| there an exception for lawsuits?
| lxgr wrote:
| There is, but I highly doubt a European court would have
| given such an order (or if they did, it would probably be
| axed by a higher court pretty quickly).
|
| There's decades of legal disputes in some European countries
| on whether it's even legitimate for the government to mandate
| your ISP or phone company to collect metadata on you for
| after-the-fact law enforcement searches.
|
| Looking at the actual data seems much more invasive than that
| and, in my (non-legally trained) estimate doesn't seem like
| it would stand a chance at least in higher courts.
| dragonwriter wrote:
| > There's decades of legal disputes in some European
| countries on whether it's even legitimate for the
| government to mandate your ISP or phone company to collect
| metadata on you for after-the-fact law enforcement
| searches.
|
| > Looking at the actual data seems much more invasive than
| that
|
| Looking at the data isn't involved in the current order,
| which requires OpenAI to preserve and segregate the data
| that would otherwise have been deleted. The reason for
| segregation is that any challenges OpenAI has to
| _providing that data in discovery_ will be heard before
| anyone other than OpenAI is ordered to have access to the
| data.
|
| This is, in fact, less invasive than the government
| mandating collection for speculative future uses, since it
| applies _only_ to _not destroying_ evidence _already_
| collected by OpenAI in the course of operating their
| business, and only for _potential_ use, subject to other
| challenges by OpenAI, in the present case.
| kelvinjps wrote:
| Maybe they will not store the chats of the European users?
| dragonwriter wrote:
| That's what they are trying to suggest, because they are still
| trying to use the GDPR as part of their argument challenging
| the US court order. (Kind of a longshot to get a US court to
| agree that the obligation of a US party to preserve evidence
| related to a suit in US courts under US law filed by another US
| party is mitigated by European regulations in any case, even if
| they argue that such preservation would violate obligations
| that the EU has imposed on them.)
| 3836293648 wrote:
| No, they're not, because the GDPR has an explicit exception for
| when a court orders that a company keeps data for discovery.
| It'd only be a GDPR violation if it's kept after this case is
| over.
| lompad wrote:
| This is not correct.
|
| > Any judgment of a court or tribunal and any decision of an
| administrative authority of a third country requiring a
| controller or processor to transfer or disclose personal data
| may only be recognised or enforceable in any manner if based
| on an international agreement, such as a mutual legal
| assistance treaty, in force between the requesting third
| country and the Union or a Member State, without prejudice to
| other grounds for transfer pursuant to this Chapter.
|
| So if, and only if, an agreement between the US and the EU
| allows it explicitly, it is legal. Otherwise it is not.
| atleastoptimal wrote:
| I've always assumed that anything sent to any company's hosted
| API will be logged forever. To assume otherwise always seemed
| naive, like thinking that apps aren't tracking your web activity.
| lxgr wrote:
| Assuming the worst is wise; settling for the worst-case
| outcome without any fight seems foolish.
| fragmede wrote:
| privacy nihilism is a decision all on its own
| morsch wrote:
| I'd only call it nihilism if you are in agreement with the
| grandparent and then do it anyway. Other choices are
| pretending it's not true (denialism), or just not thinking
| about it (ignorance). Or you complicate your life by not
| uploading your private info.
| Barrin92 wrote:
| Not really, it's basically just being anti-fragile. Consider
| any corporate entity that interacts with you to be an
| Eldritch horror from outer space that wants to siphon your
| soul, because that's effectively what it is, and keep your
| business with them to a minimum.
|
| It's just realism. Protect your private data yourself;
| relying on companies or governments to do it for you is, as
| the saying goes, letting a tiger devour you up to the neck
| and then asking it to stop at the head.
| mosdl wrote:
| It's funny that OpenAI is complaining; they don't mind saying
| copyright doesn't apply to them if it makes them money.
| ivape wrote:
| In retrospect, Bezos did the smartest thing by buying the
| Washington Post. In retrospect, Google did a great thing by
| working on a deal with Reddit. Content repositories/creators
| are going to sue these LLM companies in the West until they
| make licensing agreements. If I were OpenAI, I'd work hard to
| spend the money they raised to buy out as many of
| these outlets as possible.
|
| How much could the NYT back catalog be worth? Just buy it, ask
| the Saudis.
| WorldPeas wrote:
| So how is this going to impact Cursor's privacy mode, which is
| required by many companies for compliant usage of AI editors? For
| the uninitiated, in the web console this looks like:
|
| Privacy mode (enforced across all seats)
|
| OpenAI Zero-data-retention (approved)
|
| Anthropic Zero-data-retention (approved)
|
| Google Vertex AI Zero-data-retention (approved)
|
| xAi Grok Zero-data-retention (approved)
|
| did this just open another can of worms?
| qmarchi wrote:
| Likely, they're using OpenAI's Zero-Retention APIs where
| there's never data stored in the first place.
|
| So nothing?
| JumpCrisscross wrote:
| > _OpenAI 's Zero-Retention APIs_
|
| Do we know if the court order covers these?
| brigandish wrote:
| Yes, follow the link at the top.
| JumpCrisscross wrote:
| > _Yes, follow the link at the top_
|
| OpenAI says "this does not impact API customers who are
| using Zero Data Retention endpoints under our ZDR
| amendment."
| 8note wrote:
| At least, OpenAI zero-data-retention will by court order be
| full retention.
|
| I'm excited that the law is going to push for local models
| blerb795 wrote:
| The linked page specifically mentions that these ZDR APIs are
| not impacted.
|
| > This does not impact API customers who are using Zero Data
| Retention endpoints under our ZDR amendment.
| junto wrote:
| This is disingenuous from OpenAI.
|
| They are being challenged because NYT believes that ChatGPT was
| trained with copyrighted data.
|
| NYT naively pushes to find a way to prove that NYT data is
| being used in user chats, and how often.
|
| OpenAI spins that into "NYT is invading user privacy."
|
| It's quite transparent as to what they are doing here.
| dumbmrblah wrote:
| So is this for all chats going forward or does it include
| conversations retroactively?
| steve_adams_86 wrote:
| Presumably moving forward, because otherwise the data retention
| policies wouldn't have been followed correctly (from what I
| understand)
| kingkawn wrote:
| Once the data is kept, it is a matter of time until a new
| must-try use for it is born
| john2x wrote:
| Does this mean that if I can get ChatGPT to generate copyrighted
| text, they'll get in trouble?
| tiahura wrote:
| Every concerned ChatGPT user should file an emergency motion to
| intervene and request for stay of the order. ChatGPT can help you
| draft the motion and proposed order, just give it a copy of the
| discovery order. The SDNY has a very helpful pro se hotline.
|
| The order the judge issued is irresponsible. Maybe ChatGPT did
| get too cute in its discovery responses, but the remedy isn't to
| trample the rights of third parties.
| vessenes wrote:
| This is a massive overreach, not in the nature of the request
| ("please don't destroy data that might contain proof my case is
| strong") but in its scale, and that scale makes it an overreach
| by the judge. But shame on NYT for asking.
|
| This request also equals: "Please keep a backup of every
| Senator's private chats, every Senator's spouse's private chats,
| every military commander's personal chats, every politician in a
| foreign country, forever."
|
| There is no way that data will stay safe forever. There is no way
| that, once such a facility is built, it will not be used
| constantly, by governments all over the world.
|
| The NYT case seems to currently be on whether or not OpenAI users
| use ChatGPT to circumvent paywalls. Maybe they do, although when
| the suit was filed, 3.5 was definitely not a reliable witness to
| what NYT articles were about. There are 400 million MAUs at
| ChatGPT - more than the population of the US.
|
| To my mind there are three tranches of information that we could
| find out:
|
| 1. People's primary use case for ChatGPT is to get NYT articles
| for free. Therefore oAI is a bad actor making a tool that largely
| got profitable off infringing NYT's copyright.
|
| 2. Some core segment used/uses it for infringement purposes; not
| a lot, but it's a use case that sells licenses.
|
| 3. This happens, but just vanishingly rarely compared to most use
| cases of the tool.
|
| I'd imagine different rulings and orders to cure in each of these
| circumstances, but why is it that the court needs to know any
| more than some percentages?
|
| Assuming a 10k-token system prompt, 500 tokens of chat, 400M
| users, and five chats a week, that comes to roughly 67 terabytes
| of data per week(!) No metadata, just ASCII output.
|
| Nobody, ever, will read all of this. In fact, it would take about
| 24 hours for a Seagate drive to just push all the bytes down a
| bus, much less process any of it. Why not agree on representative
| searches, get a team to spot check data, and go from there?
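|
| A quick sanity check of those figures (a minimal sketch; every
| number is an assumption from the estimate above, not a measured
| value, and the ~800 MB/s sustained transfer rate is my guess for
| a fast drive):
|
|     # Back-of-the-envelope check of the ~67 TB/week estimate.
|     tokens_per_chat = 10_000 + 500     # system prompt + chat
|     bytes_per_token = 3.2              # plain ASCII, no metadata
|     users = 400_000_000                # claimed MAUs
|     chats_per_week = 5
|
|     weekly_bytes = (tokens_per_chat * bytes_per_token
|                     * users * chats_per_week)
|     print(f"{weekly_bytes / 1e12:.0f} TB per week")  # ~67 TB
|
|     # Time just to stream one week of chats off a single drive.
|     hours = weekly_bytes / 800e6 / 3600
|     print(f"{hours:.0f} hours")                      # ~23 hours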
|
| Personally, I would guess the percentage of "infringement" use
| cases, IF it is even infringement to get an AI to verbatim quote
| a news article while it is NOT infringement for Cloudflare to
| give a verbatim quote of a news article, is going to be tiny,
| tiny, tiny.
|
| NYT should back the fuck off, remember it's supposed to be a
| force for good in the world and not be the cause of massive
| possible downstream harm to people all over the world.
| fallingknife wrote:
| It's obviously 3 because the entire point of the NYT is that
| it's a newspaper and probably 99% of their traffic is from
| articles new enough that they haven't had time to go into the
| training data. So anybody who wanted to use ChatGPT to breach
| the NYT paywall couldn't get any new articles. Also there are
| so many other ways to breach a paywall that you would have to
| be insane to try to do it through prompt engineering ChatGPT.
| The whole case is a scam and I hope the court makes them pay
| OpenAI's legal fees.
| DrillShopper wrote:
| > There is no way that data will stay safe forever. There is no
| way that, once such a facility is built, it will not be used
| constantly, by governments all over the world.
|
| That's on OpenAI for deciding to retain this data in the first
| place. They could just _not_ have done that. That was a choice,
| _their choice_ , and therefore they're responsible for it.
| throwaway6e8f wrote:
| Agent-1, I want to legally retain all customer data indefinitely
| but I'm worried about a backlash from the public. Also, I'm
| having a bunch of problems with the NYT accusing us of copyright
| violation. Give me a strategy to resolve these issues so that I
| win in the long term.
| dataflow wrote:
| > ChatGPT Enterprise and ChatGPT Edu: Your workspace admins
| control how long your customer content is retained. Any deleted
| conversations are removed from our systems within 30 days, unless
| we are legally required to retain them.
|
| I'm confused, how does this not affect Enterprise or Edu? They
| clearly possess the data, so what makes them different legally?
| oxw wrote:
| Enterprise has an exemption granted by the judge
|
| > When we appeared before the Magistrate Judge on May 27, the
| Court clarified that ChatGPT Enterprise is excluded from
| preservation.
| dataflow wrote:
| Oh I missed that part, thanks. I wonder why. I guess the
| judge assumes it isn't being used for copyright infringement,
| but other plans might be?
| bee_rider wrote:
| No idea, but just to speculate--the court's goal isn't
| actually to scare OpenAI's users or harm their business,
| right? It is to collect evidence. Maybe they just figured
| they don't need to dip into that pool to get enough
| evidence.
| Grikbdl wrote:
| Who knows, it's probably the judge's twisted idea of
| "that'd be too far", as if cancelling basic privacy
| expectations of all users everywhere wouldn't be.
| landonxjames wrote:
| Repeatedly calling the lawsuit baseless feels like it makes
| OpenAI's point a lot weaker. They obviously don't like the suit, but
| I don't think you can credibly argue that there aren't tricky
| questions around the use of copyrighted materials in training
| data. Pretending otherwise is disingenuous.
| sigilis wrote:
| They pay their lawyers and whoever made this page a lot for the
| express purpose of credibly arguing that it is very clearly
| totally legal and very cool to use any IP they want to train
| their models.
|
| Could you with a straight face argue that the NYT newspaper
| could be a surrogate girlfriend for you like a GPT can be? They
| maintain that it is obviously a transformative use and
| therefore not an infringement of copyright. You and I may
| disagree with this assertion, but you can see how they could
| see this as baseless, ridiculous, and frivolous when their
| livelihoods depend on that being the case.
| Caelus9 wrote:
| Honestly, this incident makes me feel that it is really difficult
| to draw a clear line between "protecting privacy" and "obeying
| the law". On the one hand, I am very relieved that OpenAI stood
| up and said "no". After all, we all know that these systems
| collect everything by default, which makes people a bit panicked.
| But on the other hand, it sounds very strange that the court can
| directly say "give me all the data", even those that users
| explicitly delete. Moreover, this also shows that everyone
| actually cares about their information and privacy now. No one
| wants their data used for just anything.
| wand3r wrote:
| Does anyone know how this can be enforced?
|
| The ruling and situation aside, to what degree is it possible to
| enforce something like this and what are the penalties? Even in
| GDPR and other data protection cases, it seems super hard to
| enforce. Directives to keep or delete data basically require
| system level access, because the company can always CRUD their
| data whenever they want and whatever is in their best interest.
| Data can be ordered to be produced to a court periodically and
| audited, which could maybe catch an individual case, I guess. There is
| basically no way to know without literally seizing the servers in
| an extreme case. Also, the consequences in most cases are a fine.
| mmooss wrote:
| This isn't the executive branch of the US government, which has
| Constitutional powers. It's a private company and the court can
| at least impose massive penalties, presumptions against them
| at trial (causing them to lose), and contempt of court. Talk to
| a lawyer before you try something like it.
| imiric wrote:
| > the court can at least enforce massive penalties
|
| A.k.a. the cost of doing business.
| mmooss wrote:
| Businesses care deeply about money. The bravado of many
| businesspeople these days, that they are immune to
| criticism, lawsuits, etc. is a bluff. It apparently works,
| because many people repeat it.
| imiric wrote:
| When fines are a small percentage of the company's
| revenue, they do nothing to stop them from breaking the
| law. So they are in fact just the cost of doing business.
|
| E.g. Meta has been fined billions many times, yet they
| keep reoffending. It's basically become a revenue stream
| for governments.
| delusional wrote:
| I have no time for this circus.
|
| The technology anarchists in this thread need perspective. This
| is fundamentally a case about the legality of this product. In
| the extreme case, this will render the whole product category of
| "llm trained on copyrighted content" illegal. In that case, you
| will have been part of a copyright infringement on a truly
| massive scale. The users of these tools do NOT deserve privacy in
| the light of the crimes alleged.
|
| You do not get to claim to protect the privacy of the customers
| of your illegal venture.
| 6510 wrote:
| The harm this is doing and will do (regardless) seems to exceed
| the value of the NYT.
|
| If a company is subject to a US court order that violates EU law,
| the company could face legal consequences in the EU for non-
| compliance with EU law.
|
| The GDPR mandates specific consent and legal bases for processing
| data, including sharing it.
|
| Assuming it is legal to share it for legal purposes, one can't
| sufficiently anonymize the data. It needs to be accompanied by
| user data that allows requests to download it and for it to be
| deleted.
|
| I wonder what the fine would be if they just delete it per user
| agreement.
|
| I also wonder, could one, in the US, legally promise the customer
| they may delete their data, then choose to keep it indefinitely and
| share it with others?
| dvt wrote:
| > Does this court order violate GDPR or my rights under European
| or other privacy laws?
|
| > We are taking steps to comply at this time because we must
| follow the law, but The New York Times' demand does not align
| with our privacy standards. That is why we're challenging it.
|
| So basically no, lol. I wonder if we'll see the GDPR go head-to-
| head with Copyright Law here, that would be way more fun than
| OpenAI v NYT.
| yoaviram wrote:
| >Trust and privacy are at the core of our products. We give you
| tools to control your data--including easy opt-outs and permanent
| removal of deleted ChatGPT chats (opens in a new window) and API
| content from OpenAI's systems within 30 days.
|
| No you don't. You charge extra for privacy and list it as a
| feature on your enterprise plan. Not even paying Pro customers
| get "privacy". Also, you refuse to delete personal data included
| in your models and training data following numerous data
| protection requests.
| baxtr wrote:
| This is a typical "corporate speak" / "trustwashing" statement.
| It's usually super vague, filled with feel-good buzzwords, with
| a couple of empty value statements sprinkled on top.
| that_was_good wrote:
| Except all users can opt out. Am I missing something?
|
| It says here:
|
| > If you are on a ChatGPT Plus, ChatGPT Pro or ChatGPT Free
| plan on a personal workspace, data sharing is enabled for you
| by default, however, you can opt out of using the data for
| training.
|
| Enterprise is just opt out by default...
|
| https://help.openai.com/en/articles/8983130-what-if-i-want-t...
| agos wrote:
| What about all the rest of the data they use for training?
| There's no opt-out from that.
| bartvk wrote:
| Indeed. Click your profile in the top right, click on the
| settings icon. In Settings, select "Data Controls" (not
| "privacy") and then there's a setting called "Improve the
| model for everyone" (not "privacy" or "data sharing") and
| turn it off.
| bugtodiffer wrote:
| so they technically kind of follow the law but make it as
| hard as possible?
| bartvk wrote:
| Personally I feel it's okay but kinda weird. I mean, why
| not call it privacy? Gray pattern, IMHO. For example,
| venice.ai simply doesn't have a privacy setting because
| they don't use the data from chats. (They do have basic
| telemetry, and the setting is called "Disable Telemetry
| Collection").
| atoav wrote:
| Not sharing your data with other users does not mean the data
| of a deleted chat is gone; those are very likely two
| completely different mechanisms.
|
| And whether and how they use your data for their own purposes
| isn't touched by that either.
| Kiyo-Lynn wrote:
| Lately I'm not even sure if the things I say on OpenAI are really
| mine or just part of the platform. I never used to think much
| when chatting, but knowing some of it might be stored for a long
| time makes me feel uneasy. I'm not asking for much. I just want
| what I delete to actually be gone.
| nraynaud wrote:
| Isn't Altman collecting millions of eye scans? Since when did he
| care about privacy?
| CjHuber wrote:
| Even though how they responded is definitely controversial, I'm
| glad that they did publicize some response to it. After reading
| about it in the news yesterday and seeing no response on their
| side yet, I was worried that they would just keep silent
| molf wrote:
| It would help tremendously if OpenAI would make it possible to
| apply for zero data retention (ZDR). For many business needs
| there is no reason to store or log any request at all.
|
| In theory it is possible to apply (it's mentioned on multiple
| locations in the documentation), but in practice requests are
| just being ignored. I get that approval needs to be given, and
| that there are barriers to entry. But it seems to me they mention
| zero-data retention only for marketing purposes.
|
| We have applied multiple times and have yet to receive ANY
| response. Reading through the forums this seems very common.
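|
| For context, here is a minimal sketch of the closest per-request
| control available today (this assumes the documented `store` flag
| on the Chat Completions endpoint; actual ZDR is an account-level
| amendment that OpenAI must approve, which is exactly the problem
| described above):
|
|     from openai import OpenAI
|
|     client = OpenAI()  # reads OPENAI_API_KEY from the environment
|
|     resp = client.chat.completions.create(
|         model="gpt-4o",
|         messages=[{"role": "user", "content": "Hello"}],
|         # Ask OpenAI not to store this completion for its
|         # distillation/evals products. NOT the same as ZDR:
|         # abuse-monitoring logs are still kept for up to 30 days.
|         store=False,
|     )
|     print(resp.choices[0].message.content)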
| pclmulqdq wrote:
| The missing ingredient is money.
| jewelry wrote:
| Not just money. How are you going to handle this client's
| support ticket if there is no log at all?
| ethbr1 wrote:
| Don't. "We're unable to provide support for your request,
| because you disabled retention." Easy.
| hirsin wrote:
| They don't care, they still want support and most
| leadership teams are unwilling to stand behind a stance
| of telling customers no.
| abeppu wrote:
| ... but why is not responding to a request for zero
| retention today better than not being able to respond to
| a future request? They're basically already saying no to
| customers who request this capability that they said they
| support, but their refusal is in the form of never
| responding.
| belter wrote:
| If this stands I don't think they can operate in the EU
| bunderbunder wrote:
| I highly doubt this court order affects people using OpenAI
| services from the EU, as long as they're connecting to EU-
| based servers.
| glookler wrote:
| >> Does this court order violate GDPR or my rights under
| European or other privacy laws?
|
| >> We are taking steps to comply at this time because we
| must follow the law, but The New York Times' demand does
| not align with our privacy standards. That is why we're
| challenging it.
| danielfoster wrote:
| They didn't say which law (the US judge's order or EU
| law) they are complying with.
| lmm wrote:
| > In theory it is possible to apply (it's mentioned on multiple
| locations in the documentation), but in practice requests are
| just being ignored. I get that approval needs to be given, and
| that there are barriers to entry. But it seems to me they
| mention zero-data retention only for marketing purposes.
|
| What's the betting that they just write it on the website and
| never actually implemented it?
| sigmoid10 wrote:
| Tbf the approach seems pretty standard. Azure also only
| offers zero retention to vetted customers and otherwise
| retains data for up to 30 days to monitor and detect abuse.
| Since the possibilities for abuse are so high with these
| models, it would make sense that they don't simply give that
| kind of privilege to everyone - if only to cover their own
| legal position.
| ArnoVW wrote:
| My understanding is that they log 30 days by default, for
| handling of bugs. And that you can request 0 days. This is from
| their documentation
| lcnPylGDnU4H9OF wrote:
| > And that you can request 0 days.
|
| Right but the problem they're having is that the request is
| ignored.
| miles wrote:
| > I get that approval needs to be given, and that there are
| barriers to entry.
|
| Why is approval necessary, and what specific barriers (before
| the latest ruling) prevent privacy and no logging from being
| the default?
|
| OpenAI's assurances have long been met with skepticism by many,
| with the assumption that inputs are retained, analyzed, and
| potentially shared. For those concerned with genuine privacy,
| local LLMs remain essential.
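|
| As a concrete example, a minimal sketch of pointing the standard
| OpenAI client at a local Ollama server (assuming Ollama is
| installed and a model such as llama3.2 has been pulled; Ollama
| exposes an OpenAI-compatible endpoint on localhost, so no chat
| content leaves the machine):
|
|     from openai import OpenAI
|
|     # Talk to a local Ollama server instead of api.openai.com.
|     client = OpenAI(base_url="http://localhost:11434/v1",
|                     api_key="ollama")  # required but unused
|
|     resp = client.chat.completions.create(
|         model="llama3.2",  # any locally pulled model
|         messages=[{"role": "user", "content": "Hi"}],
|     )
|     print(resp.choices[0].message.content)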
| AlecSchueler wrote:
| > what specific barriers (before the latest ruling) prevent
| privacy and no logging from being the default?
|
| Product development?
| 1vuio0pswjnm7 wrote:
| "You can also request zero data retention (ZDR) for eligible
| endpoints if you have a qualifying use-case. For details on
| data handling, visit our Platform Docs page."
|
| https://openai.com/en-GB/policies/row-privacy-policy/
|
| 1. You can request it but there is no promise the request will
| be granted.
|
| Defaults matter. Silicon Valley's defaults are not designed for
| privacy. They are designed for profit. OpenAI's default is
| retention. Outputs are saved by default.
|
| It is difficult to take the arguments in their memo in support of
| their objection to the preservation order seriously. OpenAI already
| preserves outputs by default.
| mediumsmart wrote:
| It's a newspaper. Copies are sold for a price, not to one person,
| and they don't come with an NDA. They become part of history and
| society.
| conartist6 wrote:
| Hey OpenAI! In your "why is this happening" you left some bits
| out.
|
| You make it sound like they're mad at you for no reason at all.
| How unreasonable of them when confronted with such honorable
| folks as yourselves!
| energy123 wrote:
| > Consumer customers: You control whether your chats are used to
| help improve ChatGPT within settings, and this order doesn't
| change that either.
|
| Within "settings"? Is this referring to the dark pattern of
| providing users with a toggle "Improve model for everyone" that
| doesn't actually do anything? Instead users must submit a request
| manually on a hard to discover off-app portal, but this dark
| pattern has deceived them into think they don't need to look for
| it.
| sib301 wrote:
| Can you please elaborate?
| energy123 wrote:
| To opt-out of your data being trained on, you need to go to
| https://privacy.openai.com and click the button "Make a
| Privacy Request".
| alextheparrot wrote:
| in the app: Settings ~> Data Controls ~> Improve the model
| for everyone
| curtisblaine wrote:
| Yes, could you please explain why toggling "Improve model for
| everyone" off doesn't do anything and provide a link to this
| off-portal app that you mention?
| jamesgill wrote:
| Follow the money.
| udev4096 wrote:
| The irony is palpable here
| hombre_fatal wrote:
| You know how it's always been a meme that you'd be mortally
| embarrassed if your browser history ever leaked?
|
| Imagine how much worse it is for your LLM chat history to leak.
|
| It's even worse than your private comms with humans because it's
| a raw look at how you are when you think you're alone, untempered
| by social expectations.
| vitaflo wrote:
| WTF are you asking LLMs and why would you expect any of it to
| be private?
| ofjcihen wrote:
| "Write a song in the style of Slipknot about my dumb inbred
| dogs. I love them very much but they are...reaaaaally dumb."
|
| To be fair the song was intense.
| hombre_fatal wrote:
| It's not that the convos are necessarily icky.
|
| It's that it's like watching how someone might treat a slave
| when they think they're alone. And how you might talk down to
| or up to something that looks like another person. And how
| pathetic you might act when it's not doing what you want. And
| what level of questions you outsource to an LLM. And what
| things you refuse to do yourself. And how petty the tasks
| might be, like workshopping a stupid twitter comment before
| you post it. And how you copied that long text from your
| distraught girlfriend and asked it for some response ideas.
| etc. etc. etc.
|
| At the very least, I'd wager that it reveals that bit of true
| helpless patheticness inherent in all of us that we try so
| hard to hide.
|
| Show me your LLM chat history and I will learn a lot about
| your personality. Nothing else compares.
| Jackpillar wrote:
| Might have to re-emphasize his question, but: what
| questions are you asking your LLM? Why are you responding
| to it and/or "treating" it differently than you would a
| calculator or search engine?
| hombre_fatal wrote:
| Because it's far more capable than a calculator or search
| engine and because you interact with it with
| conversational text, it reveals more aspects about your
| personality.
|
| Why might your search engine queries reveal more about
| you than your keystrokes in a calculator? Now dial that
| up.
| Jackpillar wrote:
| Sure - but I don't interact with it as if it's human, so my
| demeanor or attitude is neutral because I'm talking to
| you know - a computer. Are you getting emotional with and
| reprimanding your chatbot?
| hombre_fatal wrote:
| I don't get why I'm receiving pushback here. How you
| treat the LLM was only a fraction of my examples for ways
| you can look pathetic if your chats were made public.
|
| You don't reprimand the google search box, yet your
| search history might still be embarrassing.
| hackinthebochs wrote:
| Your points were very accurate and relevant. Some people
| have a serious lack of imagination. The perpetual
| naysayers will never have their minds changed.
| hombre_fatal wrote:
| Good god, thank you. I thought I was making an obvious,
| unanimous point when I wrote that first comment.
| AlecSchueler wrote:
| It's so tiring to read. You're making a reasonable point.
| Some people can't believe that other people behave or
| feel differently to themselves.
| alec_irl wrote:
| > how you copied that long text from your distraught
| girlfriend and asked it for some response ideas
|
| good lord, if tech were ethical then there would be
| mandatory reporting when someone consults an LLM to tell
| them how they should be responding to their intimate
| partner. are your skills of expression already that hobbled
| by chat bots?
| hombre_fatal wrote:
| These are just concrete examples to get the imagination
| going, not an exhaustive list of the ways that you are
| revealing your true self in the folds of your LLM chat
| history.
|
| Note that it doesn't have to go all the way to "he gets
| Claude to help him win text arguments with his gf" for an
| uncomfortable amount of your self to be revealed by the
| chats.
|
| There is always something icky about someone observing
| messages you wrote in privacy, and you don't have to have
| particularly unsavory messages for it to be icky. Why is
| that?
| alec_irl wrote:
| i don't personally see messages with an LLM as being
| different from, say, terminal commands. it's a machine
| interface. it sounds like you're anthropomorphizing the
| chat bot, if you're talking to it like you would a human
| then i would be more worried about the implications that
| has for you as a person.
| hombre_fatal wrote:
| Focusing on how you anthropomorphize the LLM isn't really
| interacting with the point since it was one example.
|
| Might someone's google search history be embarrassing
| even though they don't treat google like a human?
| AlecSchueler wrote:
| What does this comment add to the conversation? It feels
| like a personal attack with no real rebuttal. People
| who anthropomorphise them all talk to them; the human-
| like interface is the entire selling point.
| lcnPylGDnU4H9OF wrote:
| > are your skills of expression already that hobbled by
| chat bots?
|
| You have it backwards. My skills of expression were
| hobbled by my upbringing, and others' thoughts on self-
| expression allowed my skills to flourish. I _wish_ I had
| a chat bot to help me understand interpersonal
| communication because I could have actually had good
| examples growing up.
| threecheese wrote:
| This product is positioned as a personal copilot, and future
| iterations (based on leaked plans, may or may not be true) as
| a wholly integrated life assistant.
|
| Why would a customer expect this not to be private? How can
| one even know how it could be used against them, when they
| don't even know what's being collected or gleaned from collected
| data?
|
| I am following these issues closely, as I am terrified that
| my "assistant" will some day prevent me from obtaining
| employment, insurance, medical care etc. And I'm just a non
| law breaking normie.
|
| A current day example would be TX state authorities using
| third party social/ad data to identify potentially pregnant
| women along with ALPR data purchased from a third party to
| identify any who attempt to have an out of state abortion, so
| they can be prosecuted. Whatever you think about that law, it
| is terrifying that a shift in it could find arbitrary digital
| signals being used against you in this way.
___________________________________________________________________
(page generated 2025-06-06 23:01 UTC)