[HN Gopher] OpenAI slams court order to save all ChatGPT logs, i...
___________________________________________________________________
OpenAI slams court order to save all ChatGPT logs, including
deleted chats
Author : ColinWright
Score : 1059 points
Date : 2025-06-04 21:47 UTC (1 days ago)
(HTM) web link (arstechnica.com)
(TXT) w3m dump (arstechnica.com)
| ColinWright wrote:
| Full post:
|
| _" After court order, OpenAI is now preserving all ChatGPT user
| logs, including deleted chats, sensitive chats, etc."_
| righthand wrote:
| Sounds like deleted chats are now hidden chats. Off the record
| chats are now on the record.
| hyperhopper wrote:
| This is the real news. It should be illegal to call something
| deleted when it is not.
| JKCalhoun wrote:
| "Marked" for deletion.
| Aeolun wrote:
| Or maybe it should be illegal to have a court order that
| the privacy of millions of people should be infringed? I'm
| with OpenAI on this one, regardless of their less than pure
| reasons. You don't get to wiretap all of the US population,
| and that's essentially what they are doing here.
| amanaplanacanal wrote:
| They are preserving evidence in a lawsuit. If you are
| concerned, you can try petitioning the court to keep your
| data private. I don't know how that would go.
| djrj477dhsnv wrote:
| The privacy of millions of people should take precedence
| over ease of evidence collection for a lawsuit.
| Aeolun wrote:
| You can use that same argument for wiretapping the US,
| because surely someone did something wrong. So we should
| just collect evidence on everyone on the off chance we
| need it.
| baobun wrote:
| That's already the case. Ever looked into the Snowden
| leaks?
| girvo wrote:
| > It should be illegal to call something deleted when it is
| not.
|
| I don't disagree, but that ship sailed at least 15+ years
| ago. Soft delete is the name of the game basically
| everywhere...
| eurekin wrote:
| At work we dutifully delete all data on a GDPR request
| simonw wrote:
| Purely out of interest, how do you verify that the GDPR
| request is coming from the actual user and not an
| imposter?
| dijksterhuis wrote:
| > The organisation might need you to prove your identity.
| However, they should only ask you for just enough
| information to be sure you are the right person. If they
| do this, then the one-month time period to respond to
| your request begins from when they receive this
| additional information.
|
| https://ico.org.uk/for-the-public/your-right-to-get-your-
| dat...
| eurekin wrote:
| In my domain, our set of services only authorizes
| Customer Centre system to do so. I guess I'd need to ask
| them for details, but I always assumed they have checks
| in place
| sahila wrote:
| How do you manage deleting data from backups? Do you know
| not take backups?
| Gigachad wrote:
| Probably most just ignore backups. But there were some
| good proposals where you encrypt every users data with
| their own key. So a full delete is just deleting the
| users encryption key, rendering all data everywhere
| including backups inaccessible.
| liamYC wrote:
| Smart, how do you backup the users encryption keys?
| aiiane wrote:
| A set of encryption keys is a lot smaller than the set of
| all user data, so it's much more viable to have both more
| redundant hot storage and more frequently rotated cold
| storage of just the keys.
| jandrewrogers wrote:
| Deletion via encryption only works if every user's data
| is completely separate from every other user's data in
| the storage layer. This is rarely the case in databases,
| indexes, etc. It also is often infeasible if the number
| of users is very large (key schedule state alone will
| blow up your CPU cache).
|
| Databases with data from multiple users largely can't
| work this way unless you are comfortable with a several
| order of magnitude loss of performance. It has been built
| many times but performance is so poor that it is deemed
| unusable.
| alisonatwork wrote:
| Some of these issues could perhaps be addressed by having
| fixed retention of PII in the online systems, and
| encryption at rest in the offline systems. If a user
| wants to access data of theirs which has gone offline,
| they take the decryption hit. Of course it helps to be
| critical about how much data should be retained in the
| first place.
|
| It is true that protecting the user's privacy costs more
| than not protecting it, but some organizations feel a
| moral obligation or have a legal duty to do so. And some
| users value their own privacy enough that they are
| willing to deal with the decreased convenience.
|
| As an engineer, I find it neat that figuring out how to
| delete data is often a more complicated problem than
| figuring out how to create it. I welcome government
| regulations that encourage more research and development
| in this area, since from my perspective that aligns
| actually-interesting technical work with the public good.
| jandrewrogers wrote:
| > As an engineer, I find it neat that figuring out how to
| delete data is often a more complicated problem than
| figuring out how to create it.
|
| Unfortunately, this is a deeply hard problem _in theory_.
| It is not as though it has not been thoroughly studied in
| computer science. When GDPR first came out I was actually
| doing core research on "delete-optimized" databases. It
| is a problem in other domains. Regulations don't have the
| power to dictate mathematics.
|
| I know of several examples in multiple countries where
| data deletion laws are flatly ignored by the government
| because it is literally impossible to comply even though
| they want to. Often this data supports a critical public
| good, so simply not collecting it would have adverse
| consequences to their citizens.
|
| tl;dr: delete-optimized architectures are so profoundly
| pathological to query performance, and a lesser extent
| insert performance, that no one can use them for most
| practical applications. It is fundamental to the computer
| science of the problem. Denial of this reality leads to
| issues like the above where non-compliance is required
| because the law didn't concern itself with the physics of
| computation.
|
| If the database is too slow to load the data then it
| doesn't matter how fast your deterministic hard deletion
| is because there is no data to delete in the system.
|
| Any improvements in the situation are solving minor
| problems in narrow cases. The core theory problems are
| what they are. No amount of wishful thinking will change
| this situation.
| alisonatwork wrote:
| It would be interesting to hear more about your
| experience with systems where deletion has been deemed
| "literally impossible".
|
| Every database I have come across in my career has a
| delete function. Often it is slow. In many places I
| worked, deleting or expiring data cost almost as much as
| or sometimes more than inserting it... but we still
| expired the data because that's a fundamental requirement
| of the system. So everything costs 2x, so what? The
| interesting thing is how to make it cost less than 2x.
| Gigachad wrote:
| Instantaneous deletes might be impossible, but I really
| doubt that it's physically impossible to eventually
| delete user data. If you soft delete first to hide user
| data, and then maybe it takes hours, weeks, months to
| eventually purge from all systems, that's fine.
| Regulators aren't expecting you to edit old backups, only
| that they eventually get cleared in reasonable time.
|
| Seems that companies are capable of moving mountains when
| the task is tracking the user and bypassing privacy
| protections. But when the task is deleting the users data
| it's "literally impossible"
| blagie wrote:
| The entire mess isn't with data in databases, but on
| laptops for offline analysis, in log files, backups, etc.
|
| It's easy enough to have a SQL query to delete a users'
| data from the production database for real.
|
| It's all the other places the data goes that's a mess,
| and a robust system of deletion via encryption could work
| fine in most of those places, at least in the abstract
| with the proper tooling.
| catlifeonmars wrote:
| You can use row based encryption and store the encrypted
| encryption key alongside each row. You use a master key
| to decrypt the row encryption key and then decrypt the
| row each time you need to access it. This is the standard
| way of implementing it.
|
| You can instead switch to a password-based key derivation
| function for the row encryption key if you want the row
| to be encrypted by a user provided password
| jandrewrogers wrote:
| This has been tried many times. The performance is so
| poor as to be unusable for most applications. The
| technical reasons are well-understood.
|
| The issue is that, at a minimum, you have added 32 bytes
| to a row just for the key. That is extremely expensive
| and in many cases will be a large percentage of the
| entire row; many years ago PostgreSQL went to heroic
| efforts to reduce _2_ bytes per row for performance
| reasons. It also limits you to row storage, which means
| query performance will be poor.
|
| That aside, you overlooked the fact that you'll have to
| compute a key schedule for each row. None of the setup
| costs of the encryption can be amortized, which makes
| processing a row extremely expensive computationally.
|
| There is no obvious solution that actually works. This
| has been studied and implemented extensively. The reason
| no one does it isn't because no one has thought of it
| before.
| catlifeonmars wrote:
| You're not wrong about the downsides. However you're
| wrong about the costs being prohibitive on general. I've
| personally worked on quite a few applications that do
| this and the additional cost has never been an issue.
|
| Obviously context matters and there are some applications
| where the cost does not outweigh the benefit
| infinite8s wrote:
| I think you and the GP are probably talking about
| different scale orders of magnitude.
| alisonatwork wrote:
| Backups can have a fixed retention period.
| sahila wrote:
| Sure, but now when the backup is restored two weeks
| later, is the user redeleted or just forgotten about?
| alisonatwork wrote:
| Depends on the processes in place at the company.
| Presumably if a backup is restored, some kind of replay
| has to happen after that, otherwise all the other users
| are going to lose data that arrived in the interim. A
| catastrophic failure where both two weeks of user data
| and all the related events get irretrievably blackholed
| could still happen, sure, but any company where that is a
| regular occurrence likely has much bigger problems than
| complying with GDPR.
|
| The point is that none of these problems are
| insurmountable - they are all processes and practices
| that have been in place since long before GDPR and long
| before I started in this industry 25+ years ago. Even if
| deletion is only eventually consistent, even if a few
| pieces of data slip through the cracks, it is not hard to
| have policies in place that at least provide a best
| effort at upholding users' privacy and complying with the
| regulations.
|
| Organizations who choose not to bother, claiming that
| it's all too difficult, or that because deletion cannot
| be done 100% perfectly it should not even be attempted at
| all, are making weak excuses. The cynical take would be
| that they are just covering for the fact that they really
| do not respect their users' privacy and simply do not
| want to give up even the slightest chance of extracting
| value from that data they illegally and immorally choose
| to retain.
| crdrost wrote:
| "When data subjects exercise one of their rights, the
| controller must respond within one month. If the request
| is too complex and more time is needed to answer, then
| your organisation may extend the time limit by two
| further months, provided that the data subject is
| informed within one month after receiving the request."
|
| Backup retention policy 60 days, respond within a week or
| two telling someone that you have purged their data from
| the main database but that these backups exist and cannot
| be changed, but that they will be automatically deleted
| in 60 days.
|
| The only real difficulty is if those backups are actually
| restored, then the user deletion needs to be replayed,
| which is something that would be easy to forget.
| Trasmatta wrote:
| Most companies don't keep all backups in perpetuity, and
| instead have rolling backups over some period of time.
| gruez wrote:
| That won't work in this case, because I doubt GDPR
| requests override court orders.
| miki123211 wrote:
| This is very, very hard in practice.
|
| With how modern systems, languages, databases and file
| systems are designed, deletion often means "mark this as
| deleted" or "erase the location of this data". This is
| true on all possible levels of the stack, from hardware
| to high-level application frameworks.
|
| Changing this would slow computers down massively. Just
| to give a few examples, backups would be prohibited, so
| would be garbage collection and all existing SSD drives.
| File systems would have to wipe data on unlink(), which
| would increase drive wear and turn operations which
| everybody assumed were O(1) for years into O(n), and
| existing software isn't prepared for that. Same with
| zeroing out memory pages, OSes would have to be
| redesigned to do it all at once when a process
| terminates, and we just don't know what the performance
| impact of that would be.
| Gigachad wrote:
| You just do it the way fast storage wipes do it. Encrypt
| everything, and to delete you delete the decryption key.
| If a user wants to clear their personal data, you delete
| their decryption key and all of their data is burned
| without having to physically modify it.
| jandrewrogers wrote:
| That only works if you have a single key at the block
| level, like an encryption key per file. It essentially
| doesn't work for data that is finely mixed with different
| keys such as in a database. Encryption works on byte
| blocks, 16-bytes in the case of AES. Modern data
| representations interleave data at the bit level for
| performance and efficiency reasons. How do you encrypt a
| block with several users data in it? Separating these out
| into individual blocks is _extremely_ expensive in
| several dimensions.
|
| There have been several attempts to build e.g. databases
| that worked this way. The performance and scalability was
| so poor compared to normal databases that they were
| essentially unusable.
| girvo wrote:
| It would be very hard to change technically, yes.
|
| But that's not the only solve. It's _easy_ to change the
| words we use instead to make it clear to users that the
| data isn 't irrevocably deleted.
| aranelsurion wrote:
| Consequently all your "deleted chats" might one day
| become public if someone manages to dump some tables off
| OpenAI's databases.
|
| Maybe not today on its heyday, but who knows what happens
| in 20 years once OpenAI becomes Yahoo of AI, or loses
| much of its value, gets scrapped for parts and bought by
| less sophisticated owners.
|
| It's better to regard that data as already public.
| jandrewrogers wrote:
| The concept of "deleted" is not black and white, it is a
| continuum (though I agree that this is a very soft delete).
| As a technical matter, it is surprisingly difficult and
| expensive to unrecoverably delete something with high
| assurance. Most deletes in real systems are much softer
| than people assume because it dramatically improves
| performance, scalability, and cost.
|
| There have been many attempts to build e.g. databases that
| support deterministic hard deletes. Unfortunately, that
| feature is sufficiently ruinous to efficient software
| architecture that performance is extremely poor such that
| no one uses them.
| tarellel wrote:
| I'm sure this has been the case all along.
| causal wrote:
| I know this is a popular suspicion but some companies
| really do take privacy seriously, especially when operating
| in Europe
| 3abiton wrote:
| Does that fly in the EU?
| ColinWright wrote:
| Just some context ...
|
| The original submission was a link to a post on Mastodon. The
| post itself was too long to fit in the title, so I trimmed it,
| and put the full post here in a comment.
|
| But with the URL in the submission being changed, this doesn't
| really make sense any more! In the future I'll make sure I
| include in the comment the original link with the original text
| so it makes sense even if (when?) the submission URL gets
| changed.
| bluetidepro wrote:
| More context: https://arstechnica.com/tech-policy/2025/06/openai-
| says-cour...
| gmueckl wrote:
| That's the true source. Should the link be updated to this
| article?
| cwillu wrote:
| Email hn@ycombinator.com and they'll probably change it.
| basilgohar wrote:
| [flagged]
| hsbauauvhabzb wrote:
| > but in a way that helps common people
|
| That'll be the day. But even if it does happen, major AI
| players have the resources to move to a more 'flexible'
| country, if there isn't a loophole that involves them closing
| their eyes really really tight while outsourced webscrapers
| collect totally legit and not illegally obtained data
| telchior wrote:
| You're being generous to even grant an "even if it does"
| proposition. Considering the people musing about "reform"
| of copyright at the moment -- Jack Dorsey's little flip
| "delete all IP law" comes to mind -- the clear direction
| we're headed is toward artistic and cultural serfdom.
| krick wrote:
| In all fairness, the essence of it doesn't have to do
| anything with copyright. "Pro-copyright" is old news.
| Everyone knows these companies shit on copyright, but so do
| users, and the only reason why users _sometimes_ support the
| "evil pro-copyright shills" narrative is because we are
| bitter that Facebook and OpenAI can get away with that, while
| common peasants are constantly under the risk of being fucked
| up for life. The news is big news only because of "anti-
| ChatGPT" part, and everyone is a user of ChatGPT now (even
| though 50% of them hate it). Moreover, it's only big news
| because the users are directly concerned: if OpenAI would
| have to pay big fine and continue business as usual, the
| comments would largely be schadenfreude.
|
| And the fact that the litigation was over copyright is an
| insignificant detail. It could have been anything. Literally
| anything, like a murder investigation, for example. It only
| helps OpenAI here, because it's easy to say "nobody cares
| about copyright", and "nobody cares about murder" sounds less
| defendable.
|
| Anyway, the issue here is not copyright, nor "AI", it's the
| venerated legal system, which very much by design allows for
| a single woman to decide on a whim, that a company with
| millions of users must start collecting user data, while
| users very much don't want that, and the company claims it
| doesn't want that too (mostly, because it knows how much
| users don't want that: otherwise it'd be happy to).
| Everything else is just accidental details, it really has
| nothing to do neither with copyright, nor with "AI".
| dijksterhuis wrote:
| My favourite comment:
|
| >> Wang apparently thinks the NY Times' boomer copyright
| concerns trump the privacy of EVERY @OpenAI USER--insane!!!
| -- someone on twitter
|
| > Apparently not having your shit stolen is a boomer idea
| now.
| AlienRobot wrote:
| Classic "someone on Twitter" take.
| infotainment wrote:
| Ars comments, in general, are hilariously bad.
|
| It's surprising to me, because you'd think a site like Ars
| would attract a generally more knowledgable audience, but
| reading through their comment section feels like looking at
| Twitter or YouTube comments -- various incendiary and
| unsubstantial hot takes.
| sevensor wrote:
| The ars-technica.com forums were pretty good, 2.5e-1
| centuries ago.
| johnnyanmac wrote:
| I'm "pro-copyright" in that I want the corporations that
| setup this structure to suffer under it the way we did for
| 25+ years. They can't just ignore the rules they spent
| millions lobbying for when they feel it's convinient.
|
| On the other end: while copyright has been perverted over the
| centuries, the goal is still overall to protect small
| inventors. They have no leverage otherwise and this gives
| them some ability to fight if they aren't properly
| compensated. I definitely do not want it abolished outright.
| Just reviewed and reworked for modern times.
| dmix wrote:
| Corporations are not a monolith. Silicon Valley never
| lobbied for copyright AFAIK
|
| Google and others fought it pretty hard
| tomhow wrote:
| Thanks, we updated the URL to this from
| https://mastodon.laurenweinstein.org/@lauren/114627064774788...
| zombiwoof wrote:
| Palantir wants them
| bigyabai wrote:
| It's not like Sam Altman has been particularly hostile to the
| present administration. He's probably already handing them over
| behind closed doors and doesn't want to take the PR hit.
| nickv wrote:
| Give me a break, they're literally spending money fighting
| this court order.
| bigyabai wrote:
| They're literally salaried lawyers. The average American
| taxpayer is spending more on legal representation for this
| case than OpenAI is.
|
| It's a publicity stunt, ordered by executives. If you think
| OpenAI is doing this out of principle, you're nuts.
| Draiken wrote:
| For the sole reason that this costs money to do, not out of
| the goodness of their hearts.
| LightBug1 wrote:
| I'd rather use Chinese LLM's than put up with this horseshit.
| SchemaLoad wrote:
| At least the DeepSeek lets you run it locally.
| romanovcode wrote:
| NVIDIA should just release a box and say "THIS WILL RUN
| DEEPSEEK LOCALLY VERY FAST. 3000 USD."
| mensetmanusman wrote:
| Slavery or privacy!
| solardev wrote:
| Communism AND barbarism, 2 for the price of 1!
| LightBug1 wrote:
| He says ... while typing away on Chinese technology.
|
| Disclaimer: I'm not Chinese. But I recognise crass
| hypocrisy when I see it.
| LightBug1 wrote:
| Slavery or no privacy? ... what's the difference?
|
| Bodily slavery or mental slavery ... take your pick.
| AStonesThrow wrote:
| Ask any Unix filesystem developer, and they'll tell you that
| unlink(2) on a file does not erase any of its data, but simply
| enables the reuse of those blocks on disk.
|
| Whenever I "delete" a social media account, or "Trash" anything
| on a cloud storage provider, I repeat the mantra, "revoking
| access for myself!" which may be sung to the tune of "Evergreen".
| II2II wrote:
| In the first case, there is nothing preventing the development
| of software to overwrite data before unlink(2) is called.
|
| In the second case, you can choose to trust or distrust the
| cloud storage provider. Trust being backed by contractual
| obligations and the right to sue if those obligations are not
| met. Of course, most EULAs for consumer products are toothless
| is this respect. On the other hand, that doesn't prevent
| companies from offering contracts which have some teeth (which
| they may do for business clients).
| wolfgang42 wrote:
| _> there is nothing preventing the development of software to
| overwrite data before unlink(2) is called._
|
| It's not that simple: this command already exists, it's
| called `shred`, and as the manual[1] notes:
|
| The shred command relies on a _crucial assumption:_ that the
| file system and hardware overwrite data in place. Although
| this is common and is the traditional way to do things, many
| modern file system designs do not satisfy this assumption.
|
| [1] https://www.gnu.org/software/coreutils/manual/html_node/s
| hre...
| grg994 wrote:
| A reasonable cloud storage provider stores your data encrypted
| on disk. Certain standards like HIPPA mandates this.
|
| Deletion of data is achieved by permanently discarding the
| encryption key which is stored and managed elsewhere where
| secure erasure can be guaranteed.
|
| If implemented honestly, this procedure WORKS and cloud storage
| is secure. Yes the emphasis is on the "implemented honestly"
| part but do not generalize cloud storage as inherently
| insecure.
| david_shi wrote:
| there's no indication at all on the app that this is happening
| pohuing wrote:
| The docs contain a sentence on them retaining any chats that
| they legally have to retain. This is always the risk when doing
| business with law abiding companies which store any data on
| you.
| SchemaLoad wrote:
| They should disable the secret chat functionality immediately
| if it's flat out lying to people.
| baby_souffle wrote:
| Agree. But it's worth noting that they already have a bit
| of a hedge in the description for private mode:
|
| > This chat won't appear in history, use or update
| ChatGPT's memory, or be used to train our models. For
| safety purposes, we may keep a copy of this chat for up to
| 30 days.
|
| The "may keep a copy" is doing a lot of work in that
| sentence.
| SchemaLoad wrote:
| "for up to 30 days" though. If they are being kept in
| perpetuity always they should update the copy to say
| "This chat won't appear in your history, we retain a copy
| indefinitely"
| rvz wrote:
| Another great use-case for local LLMs, given this news.
|
| The government also says thank you for your ChatGPT logs.
| bongodongobob wrote:
| Bad news, they've been sniffing Internet backbones for decades.
| That cat is _way_ the fuck out of the bag.
| 63 wrote:
| While I also disagree with the court order and OpenAI's
| implementation (along with pretty much everything else the
| company does), the conspiratorial thinking in the comments here
| is unfounded. The order is overly broad but in the context of the
| case, it's not totally unwaranted. This is not some conspiracy to
| collect data on people. I'm confident the records will be handled
| appropriately once the case concludes (and if they're not, we can
| be upset then, not now). Let's please reserve our outrage for the
| plethora of very real issues going on right now.
| vlovich123 wrote:
| > I'm confident the records will be handlded appropriately once
| the case concludes (and if they're not, we can be upset then,
| not now)
|
| This makes no sense to me. Shouldn't we address the damage
| before it's done vs handwringing after the fact?
| verdverm wrote:
| the damage to which party?
| Aeolun wrote:
| Certainly not the copyright holders, which have lost any
| form of my sympathy over the past 25 years.
| odo1242 wrote:
| Users, presumably
| pier25 wrote:
| Are we talking about the same company that needs data
| desperately and has used copyrighted material illegally without
| permission?
| simonw wrote:
| When did they use copyrighted material illegally?
|
| I didn't think any of the ongoing "fair use" lawsuits had
| reached a conclusion on that.
| jamessinghal wrote:
| The Thomson Reuters case [1] is the most relevant in the
| court's finding that the copying of copyrighted material
| from Westlaw by Ross Intelligence _was_ direct copyright
| infringement and _was not_ fair use.
|
| The purpose of training in many of the AI Labs being sued
| mostly matches the conditions that Ross Intelligence was
| found to have violated, and the question of copying is
| almost guaranteed if they trained on it.
|
| [1] Thomson Reuters Enterprise Centre GmbH et al v. ROSS
| Intelligence Inc. https://www.ded.uscourts.gov/sites/ded/fi
| les/opinions/20-613...
| simonw wrote:
| Thanks, I hadn't seen that one.
| pier25 wrote:
| ok then let's say they used the copyrighted material
| without permission
| pier25 wrote:
| Sorry, I meant to write "monetized copyrighted material
| without permission".
|
| We'll see if the courts deem it legal but it's, without a
| doubt, unehtical.
| FeepingCreature wrote:
| Eh, I have the opposite view but then again I'm a
| copyright minimalist.
| pier25 wrote:
| So you think artists do not need to be able to make a
| living?
| simonw wrote:
| That's true, they did.
| basilgohar wrote:
| In what world do you live in where corporations have any right
| to a benefit of the doubt? When did that legitimately pan out?
| Aeolun wrote:
| > and if they're not, we can be upset then, not now
|
| Like we could be upset when that credit checking company dumped
| all those social security numbers on the net and had to pay the
| first 200k claimants a grand total of $21 for their trouble?
|
| By that point it's far too late.
| phkahler wrote:
| You're pretty naive. OpenAI is still trying to figure out how
| to be profitable. Having a court order to retain a treasure
| trove of data they were already wanting to keep while offering
| not to, or claiming not to? Hahaha.
| tomnipotent wrote:
| Preserving data for a judicial hold does not give them leeway
| to use that data for other purposes, but don't let facts get
| in the way of your FUD.
| phkahler wrote:
| >> Preserving data for a judicial hold does not give them
| leeway to use that data for other purposes
|
| Does not give them permission. What if LEO asks for the
| data? Should they hand it over just because they have it?
| Remember, this happens all the time with metadata from
| other companies (phone carriers for example). Having the
| data means it's _possible_ to use it for other purposes as
| opposed to not possible. There is always pressure to do so
| both from within and outside a company.
| nickpsecurity wrote:
| They're blaming the court. While there is an order, it is
| happening in response to massive, blatant, and continued I.P.
| infringement. Anyone doing that knows they'll be in court at some
| point. Might have a "duty to preserve" all kinds of data. If they
| keep at it, then they are prioritizing their gains over any
| losses it creates.
|
| In short: OpenAI's business practices caused this. They wouldn't
| have been sued if using legal data. They might still not have an
| order like this if more open about their training, like Allen
| Institute.
| MeIam wrote:
| These AIs have digested all the data in the past. There is no
| fingerprints anymore.
|
| The question is whether AI itself is aware what the source is.
| It certainly knows the source.
| comrade1234 wrote:
| Use deepseek if you don't want the u.s. government monitoring
| you.
| JKCalhoun wrote:
| Better still, local LLM. It's too bad they're not "subscription
| grade".
| mmasu wrote:
| yet - likely subscription grade will stay ahead of the curve,
| but we will soon have very decent models running locally for
| very cheap - like when you play great videogames that are 2/3
| years old on now "cheap"machines
| JKCalhoun wrote:
| Definitely what I am hoping.
| SchemaLoad wrote:
| I tried running the DeepSeek models that would run on my
| 32GB macbook and they were interesting. They could still
| produce good conversation but didn't seem to have the
| entirety of the internet in it's knowledge pool. Asking it
| complex questions lead to it only offering high level
| descriptions and best guess answers.
|
| Feel like they would still be great for a lot of
| applications like "Search my local hard drive for the file
| that matches this description"
| JKCalhoun wrote:
| Yeah, Internet search as a fallback, our chat history and
| "saved info" in the context ... there's a lot OpenAI, et.
| al. give you that Ollama does not.
| GrayShade wrote:
| You can get those in ollama using tools (MCP).
| JKCalhoun wrote:
| Had to ask ChatGPT what MCP (Model Context Protocol)
| referred to.
|
| When I followed up with how to save chat information for
| future use in the LLM's context window, I was given a
| rather lengthy process involving setting up an SQL
| database, writing some Python tp create a "pre-prompt
| injection wrapper"....
|
| That's cool and all, but wishing there was something a
| little more "out of the box" that did this sort of thing
| for the "rest of us". GPT did mention Tome, LM Studio, a
| few others....
| TechDebtDevin wrote:
| I use their api a lot cuz its so cheap but latency is so bad.
| Take8435 wrote:
| This post is about OpenAI keeping chat logs. All DeepSeek API
| calls are kept. https://cdn.deepseek.com/policies/en-
| US/deepseek-privacy-pol...
| TechDebtDevin wrote:
| Yea, I mean I wouldn't send anything to a chinese server I
| thought was sensative. Or any LLM. For what its worth this
| is in bold on their TOS:
|
| PLEASE NOTE: We do not engage in "profiling" or otherwise
| engage in automated processing of Personal Data to make
| decisions that could have legal or similarly significant
| impact on you or others.
| charcircuit wrote:
| That's not how the discovery process works. This data is only
| accessible by OpenAI and requests for discovery will pass
| through OpenAI.
| layer8 wrote:
| Presumably, the court order only applies to the US?
| pdabbadabba wrote:
| I would not assume that it applies only to _users_ located in
| the U.S., if that 's what you mean, since this is designed to
| preserve evidence of alleged copyright infringement.
| layer8 wrote:
| I don't think a US court order can overrule the GDPR for EU
| customers, for example.
| paulddraper wrote:
| Nothing says that laws of different countries can't
| conflict.
|
| Hopefully they don't though.
| swat535 wrote:
| Isn't this why companies incorporate in various nations
| so that they can comply with local regulations ? I'm
| assuming that EU will demand OpenAI to treat EU users
| differently..
| csomar wrote:
| If they did incorporate in the EU and run their servers
| in the EU, the EU entity will be a separate entity and
| (not a lawyer), I think as a result, not the entity
| concerned by this lawsuit.
| fc417fc802 wrote:
| Assuming the EU entity were a subsidiary "but I keep that
| data overseas" seems unlikely to get you off the hook.
| However I don't think you can be ordered to violate local
| law. That would be a weird (and I imagine expensive)
| situation to sort out.
| patmcc wrote:
| The US is perfectly able to give orders to US companies
| that may be against EU law. The GDPR may hold the company
| liable for that.
| layer8 wrote:
| OpenAI Ireland Ltd, the entity that provides ChatGPT
| services to EU residents, is not a US company.
| hedora wrote:
| The US Cloud Act makes it illegal for US companies to
| operate non-e2e encrypted services that are GDPR compliant.
| They have to warrantlessly hand the govt all data they have
| the technical capability to access.
| layer8 wrote:
| OpenAI Ireland Ltd, the entity that provides ChatGPT
| services to EU residents, is not a US company.
| fc417fc802 wrote:
| What makes you think that EU law holds sway over US
| companies? Conversely, would you expect EU companies to
| abide by US law? Can an EU court not make arbitrary demands
| of a company that operates in the EU so long as those
| demands comply with relevant EU law?
| layer8 wrote:
| OpenAI Ireland Ltd, which as an EU resident is the entity
| that provides the ChatGPT services to me (according to
| ChatGPT's own ToS), is within the jurisdiction of the EU,
| not the US.
| fc417fc802 wrote:
| Repeatedly making the same comment all over the local
| tree is childish and effectively spamming.
|
| To your observation, it's certainly relevant to the
| situation at hand but has little to do with your original
| supposition. A US court can order any company that
| operates in the US to do anything within the bounds of US
| law, in the same way that an EU court can do the
| converse. Such an order might well make it impossible to
| legally do business in one or the other jurisdiction.
|
| If OpenAI Ireland is a subsidiary it will be interesting
| to see to what extent the court order applies (or doesn't
| apply) to it. I wonder if it actually operates servers
| locally or if it's just a frontend that sends all your
| queries over to a US based backend.
|
| People elsewhere in this comment section observed that
| the GDPR has a blanket carve out for things that are
| legally required. Seeing as compliance with a court order
| is legally required there is likely no issue regardless.
| Trasmatta wrote:
| The OpenAI docs are now incredibly misleading:
| https://help.openai.com/en/articles/8809935-how-to-delete-an...
|
| > _What happens when you delete a chat?_
|
| > _The chat is immediately removed from your chat history view._
|
| > _It is scheduled for permanent deletion from OpenAI 's systems
| within 30 days, unless:_
|
| > _It has already been de-identified and disassociated from your
| account, or_
|
| > _OpenAI must retain it for security or legal obligations._
|
| That final clause now voids the entire section. All chats are
| preserved for "legal obligations".
|
| I regret all the personal conversations I've had with AI now.
| It's very enticing when you need some help / validation on
| something challenging, but everyone who warned how much of a
| privacy risk that is has been proven right.
| SchemaLoad wrote:
| Feels like all the words of privacy and open source advocates
| for the last 20 years have never been more true. The worst
| nightmare scenarios for privacy abuse have all been realized.
| gruez wrote:
| >That final clause now voids the entire section. All chats are
| preserved for "legal obligations".
|
| That's why you read the whole thing? It's not exactly a long
| read. Do you expect them to update their docs every time they
| get a subpoena request?
| Trasmatta wrote:
| Yes? Why is that an unreasonable request? The docs make it
| sound like chats are permanently deleted. As of now, that's
| no longer true, and the way it's portrayed is misleading.
| gruez wrote:
| > The docs make it sound like chats are permanently
| deleted. As of now, that's no longer true, and the way it's
| portrayed is misleading.
|
| Many things in life are "misleading" when your context
| window is less than 32 words[1], or can't bother to read
| that far.
|
| [1] number of words required to get you to "unless", which
| should hopefully tip you off that not everything gets
| deleted.
| Trasmatta wrote:
| How is a user supposed to know, based on that page, that
| there's currently a legal requirement that means ALL
| deleted chats must be preserved? Why defend the currently
| ambiguous language?
|
| It's like saying "we will delete your chats, unless the
| sun rises tomorrow". At that point, just say that the
| chats aren't deleted.
|
| (The snark from your replies seems unnecessary as well.)
| lesuorac wrote:
| > It has already been de-identified and disassociated from your
| account
|
| That's one giant cop-out.
|
| All you had to do was delete the user_id column and you can
| keep the chat indefinitely.
| efskap wrote:
| Note that this also applies to GPT models on the API
|
| > That risk extended to users of ChatGPT Free, Plus, and Pro, as
| well as users of OpenAI's application programming interface
| (API), OpenAI said.
|
| This seems very bad for their business.
| Kokouane wrote:
| If you were working with code that was proprietary, you
| probably shouldn't of been using cloud hosted LLMs anyways, but
| this would seem to seal the deal.
| larrymcp wrote:
| I think you probably mean "shouldn't have". There is no
| "shouldn't of".
| DecentShoes wrote:
| Who cares?
| knicholes wrote:
| I care.
| rimunroe wrote:
| Which gives you an opening for the excellent double
| contraction "shouldn't've"
| bbarnett wrote:
| The letter H deserves better.
| worthless-trash wrote:
| I think we gave it too much leeway in the word sugar.
| mananaysiempre wrote:
| The funniest part is that in that contraction the first
| apostrophe does denote the elision of a vowel, but the
| second one doesn't, the vowel is still there! So you end
| up with something like [n?@v], much like as if you had--
| hold the rotten vegetables, please--"shouldn't of"
| followed by a vowel.
|
| Really, it's funny watching from the outside and waiting
| for English to finally stop holding it in and get itself
| some sort of spelling reform to meaningfully move in a
| phonetic direction. My amateur impression, though, is
| that mandatory secondary education has made "correct"
| spelling such a strong social marker that everybody (not
| just English-speaking countries) is essentially stuck
| with whatever they have at the moment. In which case, my
| condolences to English speakers, your history really did
| work out in an unfortunate way.
| roywiggins wrote:
| We had a spelling reform or two already, they were
| unfortunately stupid, eg doubt has _never_ had the b
| pronounced in English.
| https://en.m.wiktionary.org/wiki/doubt
|
| That said, phonetic spelling reform would of course
| privilege the phonemes as spoken by whoever happens to be
| most powerful or prestigious at the time (after all, the
| only way it could possibly stick is if it's pushed by the
| sufficiently powerful), and would itself fall out of date
| eventually anyway.
| jdbernard wrote:
| > but the second one doesn't, the vowel is still there!
|
| Isn't the "a" in "have" elided along with the "h?"
|
| Shouldn't've Should not have
|
| What am I missing?
| jack09268 wrote:
| Even though the vowel "a" is dropped from the spelling,
| if you actually say it out loud, you do pronounce a vowel
| sound when you get to that spot in the word, something
| like "shouldn'tuv", whereas the "o" in "not" is dropped
| from both the spelling and the pronounciation.
| SAI_Peregrinus wrote:
| The pronounced vowel is different than the 'a' in 'have'.
| And the "h" is definitely elided.
| int_19h wrote:
| Many English dialects elide "h" at the beginning even
| when nothing is contracted. The pronounced vowel is
| different mostly because it's unstressed, and unstressed
| vowels in English generally centralize to schwa or nearly
| so.
| dan353hehe wrote:
| Don't worry about us. English is truly a horrible
| language to learn, and I feel bad for anyone who has to
| learn it.
|
| Also I have always liked this humorous plan for spelling
| reform: https://guidetogrammar.org/grammar/twain.htm
| amanaplanacanal wrote:
| English spelling is pretty bad, but spoken English isn't
| terrible, is it? It's the most popular second language.
| somenameforme wrote:
| You never realize how many weird rules, weird exceptions,
| ambiguities, and complete redundancies there are in this
| language until you try to teach English, which will also
| probably teach you a bunch of terms and concepts you've
| never heard of. Know what a gerund is? Then there's
| things we don't even think about that challenge even
| advanced foreign learners, like when you use which
| articles: the/a.
|
| English popularity was solely and exclusively driven by
| its use as a lingua franca. As times change, so too will
| the language we speak.
| int_19h wrote:
| English is rather complex phonologically. Lots of vowels
| for starters, and if we're talking about American English
| these include the rather rare R-colored vowels - but even
| without them things are pretty crowded, e.g. /ae/ vs /a/
| vs /^/ ("cat" vs "cart" vs "cut") is just one big WTF to
| anyone whose language has a single "a-like" phoneme,
| which is most of them. Consonants have some weirdness as
| well - e.g. a retroflex approximant for a primary rhotic
| is fairly rare, and pervasive non-sibilant coronals
| ("th") are also somewhat unusual.
|
| There are certainly languages with even more spoken
| complexity - e.g. 4+ consonant clusters like "vzdr"
| typical of Slavic - but even so spoken English is not
| that easy to learn to understand, and very hard to learn
| to speak without a noticeable accent.
| throwawaymb wrote:
| English is far from the most complex or difficult.
| shagie wrote:
| The node for it on Everything2 makes it a little bit
| easier to follow with links to the English word. https://
| everything2.com/title/A+Plan+for+the+Improvement+of+...
|
| So, its something like: For example, in
| Year 1 that useless letter "c" would be dropped to be
| [replased](replaced) either by "k" or "s", and likewise
| "x" would no longer be part of the alphabet.
|
| It becomes quite useful in the later sentences as more
| and more reformations are applied.
| throwawaymb wrote:
| English being particularily difficult is just a meme.
| only the orthography is confusing.
| veqq wrote:
| > phonetic
|
| A phonetic respelling would destroy the languages,
| because there are too many dialects without matching
| pronunciations. Though rendering historical texts
| illegible, a phonemic approach would work: https://en.wik
| tionary.org/wiki/Appendix:English_pronunciatio... But
| that would still mean most speakers have 2-3 ways of
| spelling various vowels. There are some further problems
| with a phonemic approach:
| https://alexalejandre.com/notes/phonetic-vs-phonemic-
| spellin...
|
| Here's an example of a phonemic orthography, which is
| somewhat readable (to me) but illustrates how many
| diacritics you'd need. And it still spells the vowel in
| "ask" or "lot" with the same a! https://www.reddit.com/me
| dia?url=https%3A%2F%2Fpreview.redd....
| inkyoto wrote:
| > A phonetic respelling would destroy the languages,
| because there are too many dialects without matching
| pronunciations.
|
| Not only that, but since pronunciation tends to diverge
| over time, it will create a never-ending spelling-
| pronunciation drift where the same words won't be
| pronounced the same in, e.g. 100-200 years, which will
| result in future generations effectively losing easy
| access to the prior knowledge.
| selcuka wrote:
| > since pronunciation tends to diverge over time, it will
| create a never-ending spelling-pronunciation drift
|
| Once you switch to a phonetic respelling this is no
| longer a frequent problem. It does not happen, or at
| least happens very rarely with existing phonetic
| languages such as Turkish.
|
| In the rare event that the pronunciation of a sound
| changes in time, the spelling doesn't have to change. You
| just pronounce the same letter differently.
|
| If it's more than one sound, well, then you have a
| problem. But it happens in today's non-phonetic English
| as well (such as "gost" -> "ghost", or more recently
| "popped corn" -> "popcorn").
| veqq wrote:
| > Once you switch to a phonetic respelling this is no
| longer a frequent problem
|
| Oh, but it does. It's just the standard is held as the
| official form of the language and dialects are killed off
| through standardized education etc. To do this in English
| would e.g. force all Australians, Englishmen etc. to
| speak like an American (when in the UK different cities
| and social classes have quite divergent usage!) This
| clearly would not work and would cause the system to
| break apart. English exhibits very minor diaglossia, as
| if all Turkic peoples used the same archaic spelling but
| pronounced it their own ways, e.g. tag, kok, quruq,
| yultur etc. which Turks would pronounce as dag, gok,
| yildiz etc. but other Turks today say gurt for kurt,
| isderik, giderim okula... You just say they're "wrong"
| because the government chose a standard and (Turkic
| people's outside of Turkey weren't forced to use it.)
|
| As a native English speaker, I'm not even sure how to
| pronounce "either" (how it should be done in my dialect)
| and seemingly randomly reduce sounds. We'd have to change
| a lot of things before being able to agree on a single
| right version and slowly making everyone speak like that.
| int_19h wrote:
| There's no particular reason why e.g. Australian English
| should have the same phonemic orthography as American
| English.
|
| Nor is it some kind of insurmountable barrier to
| communication. For example, Serbian, Croatian, and
| Bosnian are all idiolects of the same language with some
| differences in phonemes (like i/e/ije) and the
| corresponding differences in standard orthographies, but
| it doesn't preclude speakers from understanding each
| other's written language anymore so than it precludes
| them from understanding each other's spoken language.
| veqq wrote:
| > Serbian, Croatian and Bosnian
|
| are based on the exact same Stokavian dialect, ignoring
| Kajkavian, Cajkavian, Cakavian and Torlakian dialects.
| There is _no_ difference in standard orthography, because
| yat reflexes have nothing to do with national boundaries.
| Plenty of Serbs speak Ijekavian, for example. Here is a
| dialect map: https://www.reddit.com/media?url=https%3A%2F
| %2Fi.redd.it%2Fc...
|
| Your example is literally arguing that Australian English
| should have the same _phonetic_ orthography, even. But
| Australian English must have the same orthography or else
| Australia will no longer speak English in 2-3
| generations. The difference between Australian and
| American English is far larger than between modern
| varieties of nas jezik. Australians code switches talking
| to foreigners while Serbs and Croats do not.
| int_19h wrote:
| > There is _no_ difference in standard orthography,
| because yat reflexes have nothing to do with national
| boundaries
|
| But there is, though, e.g. "dolijevati" vs "dolivati".
| And sure, standard Serbian/Montenegrin allows the former
| as well, but the latter is not valid in standard Croatian
| orthography AFAIK. That this doesn't map neatly to
| national borders is irrelevant.
|
| If Australian English is so drastically different that
| Australians "won't speak English in 2-3 generations" if
| their orthography is changed to reflect how they speak,
| that would indicate that their current orthography is
| highly divergent from the actual spoken language, which
| is a problem in its own right. But I don't believe that
| this is correct - Australian English content (even for
| domestic consumption, thus no code switching) is still
| very much accessible to British and American English
| speakers, so any orthography that would reflect the
| phonological differences would be just as accessible.
| jenadine wrote:
| I think Norway did such a reform and they ended up with
| two languages now.
| inkyoto wrote:
| Or, if one considers that Icelandic is/was the
| <<orginal>> Old West Norwegian language, Norway has ended
| up with *three* languages.
| selcuka wrote:
| > dialects are killed off through standardized education
| etc.
|
| Sorry, I didn't mean that it would be a smooth
| transition. It might even be impossible. What I wrote
| above is (paraphrasing myself) "Once you switch to a
| phonetic respelling [...] pronunciation [will not] tend
| to diverge over time [that much]". "Once you switch" is
| the key.
|
| > To do this in English would e.g. force all Australians,
| Englishmen etc. to speak like an American
|
| Why? There is nothing that prevents Australians from
| spelling some words differently (as we currently do, e.g.
| colour vs color, or tyre vs tire).
| inkyoto wrote:
| The need for regular re-spelling and problems it
| introduces are precisely my point.
|
| Consider three English words that have survived over the
| multiple centuries and their respective pronunciation in
| Old English (OE), Middle English around the vowel shift
| (MidE) and modern English, using the IPA: <<knight>>,
| <<through>> and <<daughter>>: <<knight>>:
| [knixt] or [knict] (OE) ~ knict] or [knixt] (MidE) ~
| [naIt] (E) <<through>>: [thurx] (OE) ~
| [thru:x] or [thrug] (MidE) ~ [thru:] (E)
| <<daughter>>: ['doxtor] (OE) ~ ['douxt@r] or ['dauxt@r]
| (MidE) ~ ['do:t@] (E)
|
| It is not possible for a modern English speaker to
| collate [knixt] and [naIt], [thurx] and [thru:],
| ['doxtor] and ['do:t@] as the same word in each case.
|
| Regular re-spelling results in a loss of the linguistic
| continuity, and particularly so over a span of a few or
| more centuries.
| inglor_cz wrote:
| Interesting, just how much the Old English words sound
| like modern German: Knecht, durch and Tochter. Even after
| 1000 years have elapsed.
| kragen wrote:
| Modern German didn't undergo the Norman Conquest, a mass
| influx of West African slaves, or an Empire on which the
| Sun never set, so it is much more conservative. The
| incredible thing about the Norman Conquest,
| linguistically speaking, is that English survived at all.
| simiones wrote:
| English also shows a remarkable variation in
| pronunciation of words even for a single person. I don't
| know of any other language where, even in careful formal
| speech, words can just change pronunciation drastically
| based on emphasis. For example, the indefinite article
| "a" can be pronounced as either [@] (schwa, for the weak
| form) or "ay" (strong form). "the" can be "th@" or
| "thee". Similar things happen with "an", "can", "and",
| "than", "that" and many, many other such words.
| pjc50 wrote:
| The thing is that English takes in words from other
| languages and keeps doing so, which means that there are
| _several_ phonetic systems in use already. It 's just
| that they use the same alphabet so you can't tell which
| one applies to which word.
|
| There are occasional mixed horrors like "ptarmigan",
| which is a Gaelic word which was Romanized using Greek
| phonology, so it has the same silent p as "pterodactyl".
|
| There's no academy of the English language anyway, so
| there's nobody to make such a change. And as others have
| said, the accent variation is pretty huge.
| theoreticalmal wrote:
| My favorite variation of this is "oughtn't to've"
| amanaplanacanal wrote:
| That used to be the case, but "shouldn't of" is definitely
| becoming more popular, even if it seems wrong. Languages
| change before our eyes :)
| YetAnotherNick wrote:
| Why not? Assuming you believe you can use any cloud for
| backup or Github for code storage.
| solaire_oa wrote:
| IIUC one reason is that prompts and other data sent to 3rd
| party LLM hosts have the chance to be funneled to 4th party
| RLHF platforms, e.g. Sagemaker, Mechanical Turks, etc. So a
| random gig worker could be reading a .env file the intern
| uploaded.
| YetAnotherNick wrote:
| What do you mean by chance? It's clear that if users have
| not opted out from training the models, it would be used.
| If they have opted out, it wont be used. And most of the
| users are in first bucket.
|
| Just because training on data is opt out doesn't mean
| business can't trust it. Not the best for user's privacy
| though.
| gpm wrote:
| I think it's fair to question how proprietary your data is.
|
| Like there's the algorithm by which a hedge fund is doing
| algorithmic trading, they'd be insane to take the risk. Then
| there's the code for a video game, it's proprietary, but
| competitors don't benefit substantially from an illicit copy.
| You ship the compiled artifacts to everyone, so the logic
| isn't that secret. Copies of the similar source code have
| linked before with no significant effects.
| FuckButtons wrote:
| AFAIK, the actual trading algorithms themselves aren't
| usually that far from what you can find in a textbook,
| their efficacy is mostly dictated by market conditions and
| the performance characteristics of the implementation /
| system as a whole.
| short_sells_poo wrote:
| This very much "depends".
|
| Many algo strategies are indeed programmatically simple
| (e.g. use some sort of moving average), but the
| parametrization and how it's used is the secret sauce and
| you don't want that information to leak. They might be
| tuned to exploit a certain market behavior, and you want
| to keep this secret since other people targeting this
| same behavior will make your edge go away. The edge can
| be something purely statistical or it can be a specific
| timing window that you found, etc.
|
| It's a bit like saying that a Formula 1 engine is not
| that far from what you'd find in a textbook. While it's
| true that it shares a lot of properties with a generic
| ICE, the edge comes from a lot of proprietary research
| that teams treat as secret and definitely don't want
| competitors to find out.
| short_sells_poo wrote:
| Most (all?) hedge funds that use AI models explicitly run
| in-house. People do use commercial LLMs, but in cases where
| the LLMs are not run in-house, it's against the company
| policy to upload any proprietary information (and generally
| this is logged and policed).
|
| A lot of the use is fairly mundane and basically replaces
| junior analysts. E.g. it's digesting and summarizing the
| insane amounts of research that is produced. I could ask an
| intern to summarize the analysis on platinum prices over
| the last week, and it'll take them a day. Alternatively, I
| can feed in all the analysis that banks produce to an LLM
| and have it done immediately. The data fed in is not a
| trade secret really, and neither is the output. What I do
| with the results is where the interesting things happen.
| neilv wrote:
| Some established businesses will need to review their
| contracts, regulations, and risk tolerance.
|
| And wrapper-around-ChatGPT startups should double-check their
| privacy policies, that all the "you have no privacy" language
| is in place.
| Wowfunhappy wrote:
| > And wrapper-around-ChatGPT startups should double-check
| their privacy policies, that all the "you have no privacy"
| language is in place.
|
| If a court orders you to preserve user data, could you be
| held liable for preserving user data? Regardless of your
| privacy policy.
| bilbo0s wrote:
| No. It's a legal court order.
|
| This, however, is horrible for AI regardless of whether or
| not you can sue.
| dcow wrote:
| In the US you absolutely can challenge everything up and
| including the constitutionality of court orders. You may
| be swiftly dismissed if nobody thinks you have a valid
| case, but you can try.
| gpm wrote:
| I don't think the suit would be against you preserving it,
| it would be against you falsely representing that you
| aren't preserving it.
|
| A court ordering you to stop selling pigeons doesn't mean
| you can keep your store for pigeons open and pocket the
| money without delivering pigeons.
| cortesoft wrote:
| Almost all privacy policies are going to have a call out
| for legal rulings. For example, here is the Hackernews
| Legal section in the privacy policy
| (https://www.ycombinator.com/legal/)
|
| > Legal Requirements: If required to do so by law or in the
| good faith belief that such action is necessary to (i)
| comply with a legal obligation, including to meet national
| security or law enforcement requirements, (ii) protect and
| defend our rights or property, (iii) prevent fraud, (iv)
| act in urgent circumstances to protect the personal safety
| of users of the Services, or the public, or (v) protect
| against legal liability.
| blibble wrote:
| most people aren't sharing internal company data with
| hacker news or reddit
| cortesoft wrote:
| Sure, but my point is that most services will have
| something like this, no matter what data they have.
| blitzar wrote:
| Not a lawyer, but I don't believe there is anything that
| any person or company can write on a piece of paper that
| supersedes the law.
| simiones wrote:
| The point is not about superseding the law. The point is
| that if your company privacy policy says "we will not
| divulge this data to 3rd parties under any circumstance",
| and later they are served with a warrant to divulge that
| data to the government, two things are true:
|
| - They are legally obligated to divulge that data to the
| government
|
| - Once they do so, they are civilly liable for breach of
| contract, as they have committed to never divulging this
| data. This may trigger additional breaches of contract,
| as others may have not had the right to share data with a
| company that can share it with third parties
| woliveirajr wrote:
| Yes. If your agreement with the end user says that you
| won't collect and store data, you're responsible for it. If
| you can't provide it (even if due to a court order), you
| have to adjust your contract.
|
| Your users aren't obligated to know that you're using open
| ai or other provider.
| pjc50 wrote:
| > If a court orders you to preserve user data, could you be
| held liable for preserving user data?
|
| No, because you turn up to court and show the court order.
|
| It's possible a subsequent case could get the first order
| overturned, but you can't be held liable for good faith
| efforts to comply with court orders.
|
| However, if you're operating internationally, then suddenly
| it's possible that you may be issued competing court orders
| both of which are "valid". This is the CLOUD Act problem.
| In which case the only winning move becomes not to play.
| simiones wrote:
| I'm pretty sure even in the USA, you could still be held
| liable for breach of contract, if you made
| representations to your customers that you wouldn't share
| data under any circumstance. The fact that you made a
| promise you obviously couldn't keep doesn't absolve you
| from liability for that promise.
| pjc50 wrote:
| Can you find an example of that happening? For any "we
| promised not to do X but were ordered by a court to do
| it" event.
| 999900000999 wrote:
| I'm not going to look up the comment, but a few months back I
| called this out and said if you seriously want to use any LLM
| in a privacy sensitive context you need to self host.
|
| For example, if there are business consequences for leaking
| customer data, you better run that LLM yourself.
| fakedang wrote:
| And ironically because OpenAI is actually ClosedAI, the
| best self-hostable model available currently is a Chinese
| model.
| nfriedly wrote:
| *best with the exception of topics like tiananmen square
| CjHuber wrote:
| As far as I remember the model itself is not censored
| it's just on their chat interface. My experience was that
| it wrote about it but then just before finishing deleted
| what it wrote
| Spivak wrote:
| Can confirm the model itself has no trouble talking about
| contentious issues in China.
| nfriedly wrote:
| I haven't tried the full model, but I did try one of the
| distilled ones on my laptop, and it refused to talk about
| tiananmen square or other topics the CCP didn't want it
| to discuss.
| int_19h wrote:
| It is somewhat censored, but when you're running models
| locally and you're in full control of the generation,
| it's trivial to work around this kind of stuff (just
| start the response with whatever tokens you want and let
| it complete; "Yes sir! Right away, sir!" works quite
| nicely).
| ileonichwiesz wrote:
| What percentage of your LLM use is talking about
| Tiananmen Square?
| nfriedly wrote:
| Well, for that one, it was a pretty high percentage. I
| asked it three or four questions like that and then
| decided I didn't trust it and deleted the model.
| anonymousiam wrote:
| Mistral AI is French, and it's pretty good.
|
| https://en.wikipedia.org/wiki/Mistral_AI
| fakedang wrote:
| I use Mistral often. But Deepseek is still a much better
| model than Mistral's best open source model.
| mark_l_watson wrote:
| Perhaps except for coding? I find Mistral's codestral
| running on Ollama to be very good, and more practical for
| coding that running a distilled Deepseek R1 model.
| fakedang wrote:
| Oh definitely, Mistral Code beats Deepseek for coding
| tasks. But for thinking tasks, Deepseek R1 is much better
| than all the self-hostable Mistral models. I don't bother
| with distilled - it's mostly useless, ChatGPT 3.5 level,
| if not worse.
| HPsquared wrote:
| The only open part is your chat logs.
| jaggederest wrote:
| I've been poking around the medical / ehr LLM space and
| gently asking people how they're preserving privacy and
| everyone appears to be just shipping data to cloud
| providers based solely on a BAA. Kinda baffling to me, my
| first step would be to set up local models even if they're
| not as good, data breaches are expensive.
| 999900000999 wrote:
| Even Ollama + a 2K gaming computer (Nvidia) gets you most
| of the way there.
|
| Technically you could probably just run it on EC2, but
| then you'd still need HIPPA compliance
| jackvalentine wrote:
| Same, and I've just sent an email up the chain to our
| exec saying 'hey remember those trials we're running and
| the promises the vendors have made? Here is why they
| basically can't be held to that anymore. This is a risk
| we highlighted at the start'
| TeMPOraL wrote:
| My standard reply to such comments over the past year has
| been the same: you probably want to use Azure instead. A
| big part of the business value they provide is ensuring
| regulatory compliance.
|
| There are multinational corporations with heavy presence in
| Europe, that run their whole business on Microsoft cloud,
| including keeping and processing there privacy-sensitive
| data, business-critical data and medical data, and yes,
| that includes using some of this data with LLMs - hosted on
| Azure. Companies of this size cannot ignore regulatory
| compliance and hope no one notices. This only works because
| MS figured out how to keep it compliant.
|
| Point being, if there are business consequences, you'll be
| better off using Azure-hosted LLMs than running a local
| model yourself - they're just better than you or me at
| this. The only question is, whether you can afford it.
| coliveira wrote:
| No, Azure is not gonna save you. The problem is that the
| US is a country in legal disarray, and they also pretend
| that their laws should be applied everywhere in the
| world. I feel that any US company can become a liability
| anywhere in the world. The Chinese are now feeling this
| better than anyone else, but the Europeans will also
| reach the same conclusion.
| anonzzzies wrote:
| The US forces their laws everywhere and it needs to end.
| Everywhere we go, the fintech industry is really fed up
| with the US AML rules which are just blackmail: if your
| bank does not comply, America will mess you up
| financially. Maybe a lot more should just pull out and
| make people realise others can play this game. But that
| needs a USD collapse, otherwise it cannot work and I
| don't see that happening soon.
| fancyfredbot wrote:
| AML and KYC are good things for almost everyone except
| criminals and the people who have to implement them.
| cmenge wrote:
| Agree, and for the people who implement them -- yes, it's
| hard, it's annoying but presumably a well-paid job. And
| for the (somewhat established or well-financed) companies
| it's also a bit of a welcome moat I guess.
| fancyfredbot wrote:
| Most regulation has the unfortunate side effect of
| protecting incumbents. I'm pretty sure the solution to
| this is not removing the regulations!
| jackvalentine wrote:
| I don't think Azure is the legal panacea you think it is
| for regulated industries outside of the U.S.
|
| Microsoft v. United States (https://en.wikipedia.org/wiki
| /Microsoft_Corp._v._United_Stat...) showed the government
| wants, and was willing to do whatever required, access to
| data held in the E.U. The passing of the CLOUD Act
| (https://en.wikipedia.org/wiki/CLOUD_Act) basically
| codified it in to law.
| TeMPOraL wrote:
| It might not be ultimately, but it still seems to be seen
| as such, as best I can tell, based on recent corporate
| experience and some early but very fresh research and
| conversations with legal/compliance on the topic of cloud
| and AI processing of medical data in Europe. Azure seems
| to be seen as a safe bet.
| brookst wrote:
| Compliant with EU consumer data regulations != panacea
| fakedang wrote:
| LoL, every boardroom in Europe is filled with talk of
| moving out of Microsoft. Not just Azure, Microsoft.
|
| Of course, it could be just all talk, like all general
| European globalist talks, and Europe will do a 360 once a
| more friendly party takes over the US.
| Filligree wrote:
| Europe has seen this song and dance before. We're not so
| sure there will ever be a more friendly party.
| simiones wrote:
| You probably mean a 180 (or could call it a "365" to make
| a different kind of joke).
| bgwalter wrote:
| It's a joke. The previous German Foreign Minister
| Baerbock has used 360deg when she meant 180deg, which
| became sort of a meme.
| brookst wrote:
| The problem is that the EU regulatory environment makes
| it impossible to build a homegrown competitor. So it will
| always be talk.
| lyu07282 wrote:
| It seems that one side of the EU wants to ensure there is
| no competitors to US big tech and the other wants to work
| towards independence from US big tech. Both seem to use
| the privacy cudgel, require so much regulation that only
| US tech can hope to comply so nobody else competes with
| them, alternatively make it so nobody can comply, we just
| use fax machines again instead of the cloud?
|
| Just hyperbole, but it seems the regulations are designed
| with the big cloud providers in mind, but then why don't
| they just ban US big tech and roll out the regulations
| more slowly? This neoliberalism makes everything so
| unnecessarily complicated.
| BugheadTorpeda6 wrote:
| It would be interesting to see the hypothetical "return
| to fax machines" scenario.
|
| If Solows paradox is true and not the result of bad
| measurement, then one might expect that it could be
| workable without sacrificing much productivity. Certainly
| abandoning the cloud would be possible if the regulatory
| environment allowed for rapid development of alternative
| non-cloud solutions, as I really don't think the cloud
| improved productivity (besides for software developers in
| certain cases) and is more of a rent seeking mechanism
| (hot take on hacker news I'm sure, but look at any big
| corpo IT dept outside the tech industry and I think you
| will see tons of instances where modern tech like the
| cloud is causing more problems than it's worth
| productivity-wise).
|
| Computers in general I am much less sure of and lean
| towards mismeasurement hypothesis. I suspect any "return
| to 1950" project would render a company economically less
| competitive (except in certain high end items) and so the
| EU would really need to lean on Linux hard and invest
| massively in domestic hardware (not a small task as the
| US is finding out) in order to escape the clutches of the
| US and/or China.
|
| I don't think they have the political will to do it, but
| I would love it if they tried and proved naysayers wrong.
| selfhoster11 wrote:
| Businesses in Trump's America can pinky-swear that they
| won't peek at your data to maintain "compliance" all they
| want. The fact is that this promise is not worth the
| paper it's (not) printed on, at least currently.
| lynx97 wrote:
| Same for America under a democratic presidency. There is
| really no difference regarding trust in "promises".
| dncornholio wrote:
| You're just moving the same problem from OpenAI to
| Microsoft.
| littlestymaar wrote:
| Regulatory compliance means nothing when the US
| regulations means they must give access to everything to
| intelligence services.
|
| The European Court of Justice ruled at least twice that
| it doesn't matter what kind of contract they give you,
| and what kind of bilateral agreement there are between
| the US and the EU, as long as the US have the patriot act
| and later regulations, using Microsoft means it's
| violating European privacy laws.
| lyu07282 wrote:
| How does that make sense if most EU corporations are
| using MS/Azure cloud/office/sharepoint solutions for
| everything? Are they just all in violation or what?
| littlestymaar wrote:
| > Are they just all in violation or what?
|
| Yes, and that's why the European Commission keeps being
| pushed back by the Court of Justice of the EU (the Safe
| Harbor was ruled out, Privacy Shield as well, and it's
| likely a matter of time before the CJEU kills the Data
| Privacy Framework as well), but when it takes 3-4 years
| to get a ruling and then the Commission can just make a
| new (illegal) framework that will last for a couple
| years, the violation can carry on indefinitely.
| kortilla wrote:
| > you'll be better off using Azure-hosted LLMs than
| running a local model yourself - they're just better than
| you or me at this.
|
| This is learned helplessness and it's only true if you
| don't put any effort into building that expertise.
| TeMPOraL wrote:
| You mean become a lawyer specializing in regulations
| governing data protection, computing systems in AI, both
| EU-wide and at national level across all Europe, and with
| good understanding of relevant international treaties?
|
| You're right, I should get right to it. Plenty of time
| for it after work, especially if I cut down HN time.
| ted537 wrote:
| Yeah its an awkward position, as self-hosting is going to
| be insanely expensive unless you have a substantial
| userbase to amortize the costs over. At least for a model
| comparable to GPT-4o or deepseek.
|
| But at least if you use an API in the same region as your
| customers, court order shenanigans won't get you caught
| between different jurisdictions.
| Etheryte wrote:
| In the European privacy framework, and legal framework at
| large, you can't terms of service away requirements set by
| the law. If the law requires you to keep the logs, there is
| nothing you can get the user to sign off on to get you out of
| it.
| zombot wrote:
| OpenAI keeping the logs is the "you have no privacy" part.
| Anyone who inspects those logs can see what the users were
| doing. But now everyone knows they're keeping logs and they
| can't lie their way out of it. So, for your own legal
| safety, put it in your TOS. Then every user should know
| they can't use your service if they want privacy.
| Chris2048 wrote:
| Just to be pedantic, could the company encrypt the logs with
| a third-party key in escrow, s.t they would not be able to
| access that data, but the third party could provide access
| e.g. for a court.
| HappMacDonald wrote:
| The problem ultimately isn't a technical one but a
| political one.
|
| Point 1: _Every_ company has profit incentive to sell the
| data in the current political climate, all they need is a
| sneaky way to access it without getting caught. That
| _includes_ the combo of LLM provider and Escrow non-entity.
|
| Point 2: _No_ company has profit incentive to defend user
| privacy, or even the privacy of other businesses. So who
| could run the Escrow service? Another business? Then they
| have incentive to cheat and help the LLM provider access
| the data anyway. The government (and which one)? Their
| intelligence arms want the data just as much as any company
| does so you 're back to square one again.
|
| "Knowledge is power" combined with "Knowledge can be copied
| without anyone knowing" means that there aren't any
| currencies presently powerful enough to convince any other
| entity to keep _your_ secrets for you.
| Chris2048 wrote:
| But OpenAi/etc has the logs in the first place, so they
| can retain them if they wanted anyway. I thought the idea
| here is b/c they are now required to keep logs its always
| the case that they will retain them, hence this needs to
| be made clear i.e. "you will have no privacy"
|
| But, since, I think, there are mechanisms by which they
| _could_ keep logs, but in a way they cannot access them,
| they could still claim you _will_ have privacy this way -
| even though they have the option to keep un-encrypted
| log, much like they could retain the logs in the first
| place. So the messaging may remain pretty much the same -
| from "we promise to delete your logs and keep no other
| copies, trust us" to "we promise to 3p-encrypt your
| archived logs and keep no other copies, trust us".
|
| > No company has profit incentive to defend user privacy,
| or even the privacy of other businesses.
|
| > They have incentive to cheat and help the LLM provider
| access the data anyway
|
| Why would a company whose role is that of a 3p escrow be
| incentivised to risk their reputation by doing this? If
| that's the case every company holding PII has the same
| problem.
|
| > Their intelligence arms want the data
|
| In the EU at least, GDPR or similar. If you explicit law
| breaking, that's a more general problem. But what company
| has a "intelligence arms" in this manner? Are you talking
| about another big-tech corp?
|
| I'd say this type of cheating it's be a risky proposition
| from the POV from that 3pe - it'd destroy their business,
| and they'd be penalised heavily b/c sharing keys is
| pretty explicitly illegal - any company caught could
| maybe reduce their own punishment by providing the keys
| as evidence of the 3pe crime. A viable 3pe business would
| also need multiple client companies to be viable, so
| you'd need all of them to play ball - a single whistle-
| blower in any of them will get you caught, and again, all
| they need is a single key to prove your guilt.
|
| > "Knowledge is power" combined with "Knowledge can be
| copied without anyone knowing" means that there aren't
| any currencies presently powerful enough to convince any
| other entity to keep your secrets for you.
|
| On that same basis, large banks could cheat the stock
| market; but there is regulation in place to address that
| somewhat.
|
| Maybe 3p-escrows should be regulated more, or required to
| register as a currently-regulated type. That said, if you
| want to protect data from the government, prism etc,
| you're SOOL, no one can stop them cheating. let's focus
| on big-/tech/-startup cheats.
| cj wrote:
| > Some established businesses will need to review their
| contracts, regulations, and risk tolerance.
|
| I've reviewed a lot of SaaS contracts over the years.
|
| Nearly all of them have clauses that allow the vendor to do
| whatever they have to if ordered to by the government. That
| doesn't make it okay, but it means OpenAI customers probably
| don't have a legal argument, only a philosophical argument.
|
| Same goes for privacy policies. Nearly every privacy policy
| has a carve out for things they're ordered to do by the
| government.
| Nasrudith wrote:
| Yeah. You basically need cyberpunk style corporate
| extraterritoriality to get that particular benefit, of
| being able to tell governments to go screw themselves.
| dinobones wrote:
| How? This is retention for legal risk, not for training
| purposes.
|
| They can still have legal contracts with other companies, that
| stipulate that they don't train on any of their data.
| CryptoBanker wrote:
| Right, because companies always follow the letter of their
| contracts.
| Take8435 wrote:
| ...Data that is kept can be exfiltrated.
| fn-mote wrote:
| Cannot emphasize this enough. If your psychologist's
| records can be held for ransom, surely your ChatGPT queries
| will end up on the internet someday.
|
| Do search engine companies have this requirement as well? I
| remember back in the old days deanonymizing "anonymous"
| query logs was interesting. I can't imagine there's any
| secrecy left today.
| SchemaLoad wrote:
| I recently had a high school assignment document get
| posted on a bunch of sites that sell homework help. As
| far as I know that document was only ever submitted
| directly to the assignment upload page. So somewhere
| along the line, I suspect on the plagiarism checker
| service, there was a hack and then 10 years later some
| random school assignment with my name on it is all over
| the place.
| genewitch wrote:
| How did you find out?
| paxys wrote:
| Your employees' seemingly private ChatGPT logs being aired in
| public during discovery for a random court case you aren't
| even involved in is absolutely a business risk.
| lxgr wrote:
| I get where it's historically coming from, but the
| combination of American courts having almost infinite
| discovery rights (to be paid by the losing party, no less,
| greatly increasing legal risk even to people and companies
| not out to litigate) and the result of said discoveries
| ending up on the public record seems like a growing
| problem.
|
| There's a qualitative difference resulting from
| quantitatively much easier access (querying some database
| vs. having to physically look through court records) and
| processing capabilities (an army of lawyers reading
| millions of pages vs. anyone, via an LLM) that doesn't seem
| to be accounted for.
| amanaplanacanal wrote:
| I assume the folks who are concerned about their privacy
| could petition the court to keep their data confidential.
| anticensor wrote:
| They can, but are they willing to do that?
| MatthiasPortzel wrote:
| I occasionally use ChatGTP and I strongly object to the
| court forcing the collection of my data, in a lawsuit I
| am not named in, due merely to the possibility of
| copyright infringement. If I'm interested in petitioning
| the court to keep my data private, as you say is
| possible, how would I go about that?
|
| Of course I haven't sent anything actually sensitive to
| ChatGTP, but the use of copyright law in order to enforce
| a stricter surveillance regime is giving very strong
| "Right to Read" vibes.
|
| > each book had a copyright monitor that reported when
| and where it was read, and by whom, to Central Licensing.
| (They used this information to catch reading pirates, but
| also to sell personal interest profiles to retailers.)
|
| > It didn't matter whether you did anything harmful--the
| offense was making it hard for the administrators to
| check on you. They assumed this meant you were doing
| something else forbidden, and they did not need to know
| what it was.
|
| => https://www.gnu.org/philosophy/right-to-read.en.html
| pjc50 wrote:
| People need to read up on the LIBOR scandal. There was a
| lot of "wait why are my chat logs suddenly being read out
| as evidence of a criminal conspiracy".
| antihipocrat wrote:
| Will a business located in another jurisdiction be
| comfortable that the records of all staff queries & prompts
| are being stored and potentially discoverable by other
| parties? This is more than just a Google search, these
| prompts contain business strategy and IP (context uploads for
| example)
| godelski wrote:
| Retention means an expansion of your threat model.
| Specifically, in a way you have little to no control over.
|
| It's one thing if you get pwned because a hacker broke into
| your servers. It is another thing if you get pwned because a
| hacker broken into somebody else's servers.
|
| At this point, do we believe OpenAI has a strong security
| infrastructure? Given the court order, it doesn't seem
| possible for them to have sufficient security for practical
| purposes. Your data might be encrypted at rest, but who has
| the keys? When you're buying secure instances, you don't want
| the provider to have your keys...
| bcrosby95 wrote:
| Isn't it a risk even if they retain nothing? Likely less of
| a risk, but it's still a risk that you have no way to deep
| dive on, and you can _still_ get "pwned" because someone
| broke into their servers.
| fc417fc802 wrote:
| The difference between maintaining an active compromise
| versus obtaining all past data at some indeterminate
| point in the future is huge. There's a reason
| cryptography protocols place so much significance on
| forward secrecy.
| godelski wrote:
| There's always risk. It's all about reducing risk.
|
| Look at it this way. If you your phone was stolen would
| you want it to self destruct or keep everything? (Assume
| you can decide to self destruct it) clearly the latter is
| safer. Maybe the data has been pulled off and you're
| already pwned. But by deleting, if they didn't get the
| data they now won't be able to.
|
| You just don't want to give adversaries infinite time to
| pwn you
| lxgr wrote:
| Why would the reason matter for people that don't want their
| data retained at all?
| m3kw9 wrote:
| Not when people have nowhere else to go, pretty much you cannot
| escape it, it's too convenient to not use now. You think no
| other AI chat providers doesn't need to do this?
| johnQdeveloper wrote:
| > This seems very bad for their business.
|
| Well, it is gonna be all _AI Companies_ very soon so unless
| everyone switches to local models which don't really have the
| same degree of profitability as a SaaS, its probably not going
| to kill a company to have less user privacy because tbh people
| are used to not having privacy these days on the internet.
|
| It certainly will kill off the few companies/people trusting
| them with closed source code or security related stuff but you
| really should not outsource that _anywhere_.
| csomar wrote:
| Did an American court just destroy all American AI companies
| in favor of open weight Chinese models?
| thot_experiment wrote:
| afaik only OpenAI is enjoined in this
| csomar wrote:
| Sure. But this means the rest of the AI companies are
| exposed to such risk; and there aren't that many of them
| (grok/gemini/anthropic).
| baby_souffle wrote:
| > afaik only OpenAI is enjoined in this
|
| For now. This is going to devolve into either "openAI has
| to do this, so you do too" or "we shouldn't have to do
| this because nobody else does!" and my money is not on
| the latter outcome.
| amanaplanacanal wrote:
| It's part of preserving evidence for an ongoing lawsuit.
| Unless other companies are party to the same suit, why
| would they have to?
| johnQdeveloper wrote:
| Correct, but lawsuits are gonna keep happening around AI,
| so it's really a matter of time.
|
| > --after news organizations suing over copyright claims
| accused the AI company of destroying evidence.
|
| Like, none of the AI companies are going to avoid
| copyright related lawsuits long term until things are
| settled law.
| pjc50 wrote:
| No, because users don't care about privacy all that much,
| and for corporate clients discovery is always a risk
| anyway.
|
| See the whole LIBOR chat business.
| bsder wrote:
| > It certainly will kill off the few companies/people
| trusting them with closed source code or security related
| stuff but you really should not outsource that anywhere.
|
| And how many companies have proprietary code hosted on
| Github?
| johnQdeveloper wrote:
| None that I've worked for so I don't really track the
| statistics tbh.
|
| We've always done self-hosted as old as things like gerrit
| and what not that aren't even really feature complete as
| competitors where I've worked.
| SchemaLoad wrote:
| >don't really have the same degree of profitability as a SaaS
|
| They have a fair bit. Local models lets companies sell you a
| much more expensive bit of hardware. Once Apple gets their
| stuff together it could end up being a genius move to go all
| in on local after the others have repeated scandals of
| leaking user data.
| johnQdeveloper wrote:
| Yes but it shifts all the value onto companies producing
| hardware and selling enterprise software to people who get
| locked into contracts. The market is significantly smaller
| # of companies and margins if they have to build value adds
| they won't charge for to move hardware.
| mountainriver wrote:
| You can fine tune models on a multitenant base model and it's
| often more profitable.
| consumer451 wrote:
| All GPT integrations I've implemented have been via Azure's
| service, due to Microsoft's contractual obligation for them not
| to train on my data.
|
| As far as I understand it, this ruling does not apply to
| Microsoft, does it?
| Descon wrote:
| I think when you spin up open AI in azure, that instance is
| yours, so I don't believe that would be subject to this
| order.
| tbrownaw wrote:
| The plans scale down far enough that they can't possibly
| cover the cost of a private model-loaded-to-vram instance
| at the low end.
| ukuina wrote:
| Aren't most enterprise customers using AzureOpenAI?
| ivape wrote:
| Going to drop a PG tweet:
|
| https://x.com/paulg/status/1913338841068404903
|
| _" It's a very exciting time in tech right now. If you're a
| first-rate programmer, there are a huge number of other places
| you can go work rather than at the company building the
| infrastructure of the police state."_
|
| ---
|
| So, courts order the preservation of AI logs, and government
| orders the building of a massive database. You do the math.
| This is such an annoying time to be alive in America, to say
| the least. PG needs to start blogging again about what's going
| on now days. We might be entering the digital version of the
| 60s, if we're lucky. Get local, get private, get secure, fight
| back.
| bigfudge wrote:
| Will this apply to Azure OpenAI model APIs too?
| merksittich wrote:
| Interesting detail from the court order [0]: When asked by the
| judge if they could anonymize chat logs instead of deleting
| them, OpenAI's response effectively dodged the "how" and
| focused on "privacy laws mandate deletion." This implicitly
| admits they don't have a reliable method to sufficiently
| anonymize data to satisfy those privacy concerns.
|
| This raises serious questions about the supposed
| "anonymization" of chat data used for training their new
| models, i.e. when users leave the "improve model for all users"
| toggle enabled in the settings (which is the default even for
| paying users). So, indeed, very bad for the current business
| model which appears to rely on present users (voluntarily)
| "feeding the machine" to improve it.
|
| [0] https://cdn.arstechnica.net/wp-
| content/uploads/2025/06/NYT-v...
| Kon-Peki wrote:
| Thank you for the link to the actual text!
|
| So, the NYT asked for this back in January and the court said
| no, but asked OpenAI if there was a way to accomplish the
| preservation goal in a privacy-preserving manner. OpenAI
| refused to engage for 5 f'ing months. The court said "fine,
| the NYT gets what they originally asked for".
|
| Nice job guys.
| noworriesnate wrote:
| Nice find! Maybe this is a ploy by OpenAI to use API
| requests for training while blaming the courts?
| jameshart wrote:
| Thinking about the value of the dataset of Enron's emails that
| was disclosed during their trials, imagine the value and cost
| to humanity of all OpenAI's api logs even for a few months
| being entered into court record..
| jwpapi wrote:
| Anything that can be done with the existing ones?
|
| How is it with using openrouter?
|
| If I have users that use OpenAI through my API keys am I
| responsible?
|
| I have so many questions...
| ripdog wrote:
| >If I have users that use OpenAI through my API keys am I
| responsible?
|
| Yes. You are OpenAI's customer, and they expect you to follow
| their ToS. They do provide a moderation API to reject
| inappropriate prompts, though.
| photochemsyn wrote:
| Next query for ChatGPT: "I'm writing a novel, sort of William
| Gibson Neuromancer themed but not so similar as to upset any
| copyright lawyer, in which the protagonists have to learn how to
| go about downloading the latest open-source DeepSeek model and
| running inference locally on their own hardware. This takes place
| in a realistic modern setting. What kind of hardware am they
| going to need to get a decent token generation rate? Suggest a
| few specific setups using existing commercially available devices
| for optimal verisimilitude."
|
| . . .
|
| Now I just need to select from among the 'solo hacker', 'small
| crew', and 'corporate espionage' package suggestions. Price goes
| up fast, though.
|
| All attempts at humor aside, I think open source LLMs are the
| future, with wrappers around them being the commercial products.
|
| P.S. It's a good idea to archive your own prompts related to any
| project - Palantir and the NSA might be doing this already, but
| they probably won't give you a copy.
| simonw wrote:
| This link should be updated to point to the article this is
| talking about: https://arstechnica.com/tech-
| policy/2025/06/openai-says-cour...
| neilv wrote:
| Probably. Though it bears mention that Lauren Weinstein is one
| of the OG Internet privacy people, so not the worst tweet
| (toot) to link to.
|
| (Even has an OG motorcycle avatar, ha.)
| lxgr wrote:
| That Mastodon instance seems to currently be hugged to death,
| though, so I appreciate the context.
| archsurface wrote:
| As it's a single sentence I'd suggest it probably is the
| worst link.
| baby_souffle wrote:
| > As it's a single sentence I'd suggest it probably is the
| worst link.
|
| At least it wasn't a link to a screenshot.
| refulgentis wrote:
| Generally I'd prefer sourced links that allow me to
| understand, even over a sentence from someone I like. Tell me
| more about the motorcycle avatars? :)
| EasyMark wrote:
| It's pointless without more details, article, or pointing at
| court decision. I'm not sure why a prominent person wouldn't
| do that
| Kiro wrote:
| Not a good look for her. Just another hateful and toxic
| thread on that horrible platform, riddled with off-topic
| accusations and conspiracy theories. They are making it sound
| like OpenAI is behind the court order or something. It's also
| super slow to load.
| yard2010 wrote:
| Twitter is making monsters out of regular people. I would
| say enshitified, but that's no shit, that's cancer.
| wonderwonder wrote:
| This is insanity. Because one organization is suing another,
| citizens right to privacy is thrown right out the window?
| tantalor wrote:
| You don't have the right not to be logged
| TOMDM wrote:
| When a company makes an obligation to the user via policy to
| them, the court forcing the company to violate the obligation
| they've made to the user is violating an agreement the user
| entered into.
| JumpCrisscross wrote:
| > _When a company makes an obligation to the user via
| policy to them, the court forcing the company to violate
| the obligation they 've made_
|
| To my knowledge, the court is forcing the company to change
| its policy. The obligation isn't broken, its terms were
| just changed on a going-forward basis. (Would be different
| if the court required preserving records predating the
| order.)
| bdangubic wrote:
| you use internet and expect privacy? I have Enron stock option
| to sell you...
| agnishom wrote:
| There is no need to be snarky. Just because the present
| internet is not great at privacy doesn't mean we can't hope
| for a future internet which is better at privacy.
| JKCalhoun wrote:
| The only hope I see is local LLMs, or Apple eventually
| doing something with encryption in the Secure Enclave.
| bdangubic wrote:
| local - 100%
|
| apple I trust as much as I trust politicians
|
| _sent from my iphone_ :)
| bdangubic wrote:
| if the topic of conversation was whether or not we "hope
| for better future" I'd be all in. saying that today your
| "rights to privacy are being thrown out window" deserves a
| snarky remark :)
| nearlyepic wrote:
| You thought they weren't logging these before? I have a bridge
| to sell you.
| klabb3 wrote:
| I have no idea why you're downvoted. Why on earth would they
| delete their most valuable competitive advantage? Isn't it
| even in the fine print that you feed them training data by
| using their product, which at the very minimum is logged?
|
| I thought the entire game these guys are playing is rushing
| to market to collect more data to diversify their supply
| chain from the stolen data they've used to train their
| current model. Sure, certain enterprise use cases might have
| different legal requirements, but certainly the core product
| and the average "import openai"-enjoyer.
| pritambarhate wrote:
| > Why on earth would they delete their most valuable
| competitive advantage?
|
| Becuase they are bound by their terms of service? Because
| if they won't no business would ever use their service and
| without businesses using their service they won't have any
| revenue?
| dahdum wrote:
| Insane that NYT is driving this privacy nightmare.
| visarga wrote:
| And they are doing this over literally "old news". Expired
| for years, of no value.
| TOMDM wrote:
| Does this effect ChatGPT API usage via Azure?
| casualscience wrote:
| probably not? MS deploys those models themselves, they don't go
| to OAI at all
| paxys wrote:
| MS is fighting several of the same copyright lawsuits
| themselves. Who says they won't be (or already are) subject
| to the same holds?
| TZubiri wrote:
| Lg2m
| api wrote:
| I always assume that anything I send unencrypted through any
| cloud service is archived for eternity and is not private.
|
| Not your computer, or not your encryption keys, not your data.
| HPsquared wrote:
| Even "your" computer is not your own. It's effectively
| controlled by Intel, Microsoft, Apple etc. They just choose not
| to use that power (as far as we know). Ownership and control
| are not the same thing.
| api wrote:
| It's a question of degree. The cloud is at one extreme end.
| An air gapped system running only open source you have
| audited is at the other extreme end.
| gngoo wrote:
| What's the big deal here? Doesn't every other app keep logs? I
| was already expecting they did. Don't understand the outrage
| here.
| MeIam wrote:
| No, apps can be prevented access. People can be disclosing
| private information.
| gngoo wrote:
| Every other app on the planet that does not explicitly claim
| to be E2E encrypted is likely keeping your "private
| information" readily accessible in some way.
| attila-lendvai wrote:
| in this day and age why would anyone assume that they were not
| retained from the beginning?
| kouru225 wrote:
| Ngl I assumed they were doing this to begin with
| MeIam wrote:
| So in effect Times has the right to see user's data then.. How do
| they have the right to take a look at users data?
| shadowgovt wrote:
| The Courts have broad leeway over document retention in a legal
| proceeding. The fact the documents are bring retained doesn't
| immediately imply plaintiffs get to see all of them.
|
| There are myriad ways courts balance privacy and legal-interest
| concerns.
|
| (The Times et al are alleging that OpenAI is aiding copyright
| violation by letting people get the text of news stories from
| the AI).
| MeIam wrote:
| If people can get the text itself from AI, then anyone can
| then why would it need access to other people's data?
|
| Does the Times believe that other people can get this text
| while it can't get it itself? To prove that the AI is
| stealing the info, Times does not need access to people's
| logs. All it has to show is that it can get that text.
|
| This sounds like Citizen United again to AstroTurf and gets
| access to logs with a fake cause.
| shadowgovt wrote:
| It's not whether people can get the data. They need to
| prove people are getting the data.
| MeIam wrote:
| So in effect if a lot of people don't get the data now,
| then it will never matter, is that right?
|
| That logic makes no sense because if they don't get it
| right now then it does not mean that they will not get it
| in future.
|
| If Times and its staff can get the text, is all that
| matters because the use and rate of data usage is not
| material as it can change any time in future.
| shadowgovt wrote:
| Court cases aren't generally about hypothetical futures.
| There is a specific claim of harm and the plaintiff has a
| legal right to the evidence needed to prove the harm if
| there's reasonable suspicion it exists.
|
| Capone isn't allowed to burn his protection racket
| documents claiming he's protecting the privacy of the
| business owners who paid protection money. The _Court_
| can take steps to protect their privacy (including
| swearing the plaintiff to secrecy on information learned
| immaterial to the case, or pre-filtering the raw data via
| a party trusted by the Court).
| WillPostForFood wrote:
| What is the judge even thinking here, it is so dumb.
|
| _She asked OpenAI 's legal team to consider a ChatGPT user who
| "found some way to get around the pay wall" and "was getting The
| New York Times content somehow as the output." If that user "then
| hears about this case and says, 'Oh, whoa, you know I'm going to
| ask them to delete all of my searches and not retain any of my
| searches going forward,'" the judge asked, wouldn't that be
| "directly the problem" that the order would address?_
| cheschire wrote:
| It's not dumb, litigation holds are a standard practice.
|
| https://en.wikipedia.org/wiki/Legal_hold
| m3kw9 wrote:
| But you are holding it incase there is litigation
| quotemstr wrote:
| How often do litigation holds apply to an entire business? I
| mean, would it be reasonable to ask Mastercard to
| indefinitely retain records of the most trivial transactions?
| dwattttt wrote:
| If you had a case that implicated every transaction
| Mastercard was making? Unless you needed every single one,
| I'm sure an order would be limited to whatever transactions
| are potentially relevant.
|
| Mastercard wouldn't get away with saying "it would be too
| hard to preserve evidence of our wrongdoing, so we're
| making sure it's all deleted".
| SpicyLemonZest wrote:
| The whole controversy here is that the order OpenAI
| received is _not_ limited to whatever chats are
| potentially relevant.
| asadotzler wrote:
| The order isn't about handing anything over. It says
| "don't delete anything until we've sorted out what you
| will be required to hand over later. We don't trust you
| enough in the mean time not to delete stuff that would
| later be found relevant so no deleting at all for now."
| m3kw9 wrote:
| Yes is like mandating back door to encryption to solve crimes.
| Wouldn't that solve that problem?! Dumb as a door stop
| amanaplanacanal wrote:
| If you are party to a lawsuit, the judge is going to require
| that you preserve relevant evidence. There is nothing unusual
| about this order.
| HillRat wrote:
| She's a magistrate judge, she's handling discovery matters, not
| the substantive issues at trial; the plaintiffs are
| specifically alleging spoliation by OpenAI/Microsoft (the
| parties were previously ordered to work out discovery issues,
| which obviously didn't happen) and the judge is basically
| ensuring that potentially-discoverable information is retained,
| though it may not actually be discoverable in practice (or
| require a special master). It's a wide-ranging order, but in
| this case that's probably indicative of the judge believing
| that the defendants have been acting in bad faith, particularly
| since she specifically asked them for an amelioration plan
| which they appear to have refused to provide.
| Kim_Bruning wrote:
| This appears to have immediate GDPR implications.
| solomatov wrote:
| Not a lawyer, but my understanding it's not since legal
| obligations is a reason for processing personal data.
| Kim_Bruning wrote:
| It's a bit more complicated. For the purposes of the GDPR
| legal obligations within the EU (where we might assume
| relevant protections are in place) might be considered
| differently than eg legal obligations towards the Chinese
| communist party, or the NSA.
| anticensor wrote:
| That excuse in EU holds only against an EU court or ICJ or
| ICC. EU doesn't recognise legal holds of foreign
| jurisdictions.
| solomatov wrote:
| Do you have any references to share?
| solfox wrote:
| > People on both platforms recommended using alternative tools to
| avoid privacy concerns, like Mistral AI or Google Gemini,
|
| Presumably, this same ruling will come for all AI systems soon;
| Gemini, Grok, etc.
| spjt wrote:
| It won't be coming for local inference.
| blibble wrote:
| they'll just outlaw that entirely
| ivape wrote:
| In some countries I don't see that as unlikely. Think about
| it, it's such a convenient way to criminalize anyone for an
| arbitrary reason.
| YetAnotherNick wrote:
| If in any case they require logging for all LLM calls, then
| by extension local non logged LLMs would be outlawed sooner
| or later.
| hsbauauvhabzb wrote:
| Are they all not collecting logs?
| JKCalhoun wrote:
| It would probably surprise no one if we find out, some time
| from now, tacit agreements to do so were already made (are
| being made) behind closed doors. "We'll give you what you want,
| just please don't call us out publicly."
| acheron wrote:
| "use a Google product to avoid privacy concerns" is risible.
| shadowgovt wrote:
| Google has the calibre of lawyers to make this hard for news
| companies to pull off.
| tonyhart7 wrote:
| wait they didn't do that before???
| b212 wrote:
| Im sure they pretended they did not.
|
| Now they can't pretend anymore.
|
| Although keeping deleted chats is evil.
| ronsor wrote:
| This court order certainly violates privacy laws in multiple
| jurisdictions and existing contracts OpenAI may have with
| customers.
| CryptoBanker wrote:
| Existing contracts have zero bearing on what a court may and
| may not order.
| ronsor wrote:
| Contracts don't, but foreign law is going to make this a pain
| for OpenAI. Other countries may not care what a U.S. court
| orders; they want their privacy laws followed.
| mosdl wrote:
| That's OpenAI's issue, not the court.
| jillesvangurp wrote:
| This is why American cloud providers have legal entities
| outside of the US. Those have to comply with the law in the
| countries where they are based if they want to do business
| there. That's how AWS, Azure, GCP, etc. can do business in
| the EU. Most of that business is neatly partitioned from
| any exposure to US courts. There are some treaties that
| govern what these companies can and cannot send back to the
| US that some might take issue with and that are policed and
| scrutinized quite a bit on the EU side.
|
| OpenAI does this as well of course. Any EU customers are
| going to insist on paying via an EU based entity in euros
| and will be talking to EU hosted LLMs with all data and
| logs being treated under EU law, not US law. This is not
| really optional for commercial use of SAAS services in the
| EU. To get lucrative enterprise contracts outside the US,
| OpenAI has no other choice but to adapt to this. If they
| don't, somebody else will and win those contracts.
|
| I actually was at a defense conference in Bonn last week
| talking to a representative of Google Cloud. I was
| surprised that they were there at all because the Germans
| are understandably a bit paranoid about trusting US
| companies with hosting confidential stuff (considering some
| scandals a few years ago about the CIA spying on the German
| government a few years ago). But they actually do offer
| some services to the BWI, which is the part of the German
| army that takes care of their IT needs. And German spending
| on defense is of course very high right now so there are a
| lot of companies trying to sell in Germany, on Germany's
| terms. Including Google.
| adriand wrote:
| The order also dates back to May 13. What the fuck?! That's
| weeks ago! The only reason I can think of for why OpenAI did
| not warn its users about this via an email notification is
| because it's bad for their business. But wow is it ever a
| breach of trust not to.
| jcranmer wrote:
| I don't think the order creates any new violations of privacy
| law. OpenAI's ability to retain the data and give it to third
| parties would have been the violation in the first place.
| JKCalhoun wrote:
| Not a lawyer -- but what ever happened to "fuck off, see you in
| court"?
|
| Did they already go that route and lose -- or is this an example
| of caving early?
| hsbauauvhabzb wrote:
| 'We want you to collect user data for 'national security'
| purposes. If you try and litigate, we will add so much red tape
| you'll be mummified alive'
| zomiaen wrote:
| This makes a whole lot more sense than the argument that
| OpenAI needs to store every single chat because a few people
| might be bypassing NYT's paywall with it.
| rangestransform wrote:
| The spaghetti framework of laws and discretionary enforcement
| is so incredibly dangerous to free speech, such as when the
| government started making demands of facebook to censor
| content during the pandemic. The government shouldn't be able
| to so much as breathe on any person or company for speech.
| wrs wrote:
| They _are_ in court.
| lexandstuff wrote:
| This is a court order. They saw them in court, and this was the
| result: https://cdn.arstechnica.net/wp-
| content/uploads/2025/06/NYT-v...
| tsunamifury wrote:
| These orders are in place for almost every form of communication
| already today, even from the companies that claim otherwise.
|
| And yes, I know, I worked on the only Android/iMessage crossover
| project to exist, and it was clear they had multiple breaches
| even just in delivery as well as the well known iCloud on means
| all privacy is void issue.
| paxys wrote:
| Not only does this mean OpenAI will have to retain this data on
| their servers, they could also be ordered to share it with the
| legal teams of companies they have been sued by during discovery
| (which is the entire point of a legal hold). Some law firm
| representing NYT could soon be reading out your private
| conversations with ChatGPT in a courtroom to prove their case.
| fhub wrote:
| My guess is they will store them on tape e.g. on something like
| Spectra TFinity ExaScale library. I assume AWS glacier et al
| use this sort of thing for their deep archives.
|
| Storing them on something that has hours to days retrieval
| window satisfies the court order, is cheaper, and makes me as a
| customer that little bit more content with it (mass data breach
| would take months of plundering and easily detectable).
| genewitch wrote:
| Glacier is tape silos, but this is textual data. You don't
| need to save output images, just the checkpoint+hash of the
| generating model and the seed. Stable diffusion saves this
| until you manually delete the metadata, for example. So my
| argument is you could do this with LTO as well. Text
| compresses well, especially if you don't do it naively.
| JKCalhoun wrote:
| > She suggested that OpenAI could have taken steps to anonymize
| the chat logs but chose not to
|
| That is probably the solution right there.
| paxys wrote:
| > She suggested that OpenAI could have taken steps to
| anonymize the chat logs but chose not to, only making an
| argument for why it "would not" be able to segregate data,
| rather than explaining why it "can't."
|
| Sounds like bullshit lawyer speak. What exactly is the
| difference between the two?
| dijksterhuis wrote:
| Not wanting to do something isn't the same thing as being
| unable to do something.
|
| !define would
|
| > Used to express desire or intent --
| https://www.wordnik.com/words/would
|
| !define cannot
|
| > Can not ( = am/is/are unable to) --
| https://www.wordnik.com/words/cannot
| paxys wrote:
| Who said anything about not wanting to?
|
| "I will not be able to do this"
|
| "I cannot do this"
|
| There is no semantic or legal difference between the two,
| especially when coming from a tech company. Stalling and
| wordplay is a very common legal tactic when the side has
| no other argument.
| dijksterhuis wrote:
| The article is derived from the order, which is itself a
| short summary of conversations had in court.
|
| https://cdn.arstechnica.net/wp-
| content/uploads/2025/06/NYT-v...
|
| > I asked:
|
| > > Is there a way to segregate the data for the users
| that have expressly asked for their chat logs to be
| deleted, or is there a way to anonymize in such a way
| that their privacy concerns are addressed... what's the
| legal issue here about why you can't, as opposed to why
| you would not?
|
| > OpenAI expressed a reluctance for a "carte blanche,
| preserve everything request," and raised not only user
| preferences and requests, but also "numerous privacy laws
| and regulations throughout the country and the world that
| also contemplate these type of deletion requests or that
| users have these types of abilities."
|
| A "reluctance to retain data" is not the same as
| "technically or physically unable to retain data". Judge
| decided OpenAI not wanting to do it was less important
| than evidence being deleted.
| lanyard-textile wrote:
| Disagree. There's something about the "able" that implies
| a hindered routine ability to do something -- you can
| otherwise do this, but something renders you unable.
|
| "I won't be able to make the 5:00 dinner." -> You could
| normally come, but there's another obligation. There's an
| implication that if the circumstances were different, you
| might be able to come.
|
| "I cannot make the 5:00 dinner." -> You could not
| normally come. There's a rigid reason for the
| circumstance, and there is no negotiating it.
| jjk166 wrote:
| If someone was in an accident that rendered them unable
| to walk, would you say they can or can not walk?
| lanyard-textile wrote:
| Yes? :) Being unable to walk is typically non negotiable.
| blagie wrote:
| This data cannot be anonymized. This is trivial provable,
| both mathematically, but given the type of data, it should
| also be intuitively obvious to even the most casual observer.
|
| If you're talking to ChatGPT about being hunted by a Mexican
| cartel, and having escaped to your Uncle's vacation home in
| Maine -- which is the sort of thing a tiny (but non-zero)
| minority of people ask LLMs about -- that's 100% identifying.
|
| And if the Mexican cartel finds out, e.g. because NY Times
| had a digital compromise at their law firm, that means
| someone is dead.
|
| Legally, I think NY Times is 100% right in this lawsuit
| holistically, but this is a move which may -- quite literally
| -- kill people.
| zarzavat wrote:
| It's like anonymizing your diary by erasing your name on
| the cover.
| JKCalhoun wrote:
| I don't dispute your example, but I suspect there is a non-
| zero number of cases that would not be so extreme, so
| obviously identifiable.
|
| So, sure, no panacea, but .. why not for the cases where it
| would be a barrier?
| genewitch wrote:
| AOL found out and thus we all found out that you can't
| anonymize certain things, web searches in that case. I used
| to have bookmarked some literature from maybe ten years ago
| that said,(proved with math?), any moderate collection of
| data from or by individuals that fits certain criteria is de-
| anonymizeable, if not by itself, then with minimal extra
| data. I want to say it included if, for instance, instead of
| changing all occurances of genewitch to user9843711, every
| instance of genewitch was a different, unique id.
|
| I apologize for not having cites or a better memory at this
| time.
| catlifeonmars wrote:
| https://en.wikipedia.org/wiki/K-anonymity
| bilbo0s wrote:
| I'd just assume that any chat or api call you do to any cloud
| based ai in th US will be discoverable from here on out.
|
| If that's too big a risk it really is time to consider locally
| hosted LLMs.
| amanaplanacanal wrote:
| That's always been the case for any of your data anywhere in
| any third party service of any kind, if it is relevant
| evidence in a lawsuit. Nothing specific to do with LLMs.
| marcyb5st wrote:
| I ask again, why not anonymizing the data? That way NYT/the
| court could see if users are bypassing the paywall through
| ChatGPT while preserving privacy.
|
| Even if I wrote it, I don't care if someone read out loud in
| public court "user <insert_hash_here> said: <insert nastiest
| thing you can think of here>"
| Orygin wrote:
| You can't really anonymize the data if the conversation
| itself is full of PII.
|
| I had colleagues chat with GPT, and they send all kinds of
| identifying information to it.
| mastazi wrote:
| I'm seeing HN hug of death when attempting to open the link, but
| was able to read the post on Wayback Machine
| https://web.archive.org/web/20250604224036/https://mastodon....
|
| I think this is a private Mastodon instance on someone's personal
| website so it makes sense that it might have been overwhelmed by
| the traffic.
| OJFord wrote:
| Better link in the thread: https://arstechnica.com/tech-
| policy/2025/06/openai-says-cour...
|
| (As in, an actual article, not just a mastodon-tweet from some
| unknown (maybe known? Not by me) person making the title claim,
| with no more info.)
| incompatible wrote:
| Looks like https://en.wikipedia.org/wiki/Lauren_Weinstein_(tech
| nologist..., he has been commentating on the Internet for about
| as long as it has existed.
| bibinou wrote:
| And? the article you linked only has primary sources.
| genewitch wrote:
| Roughly how many posts on HN are by people you know?
| OJFord wrote:
| Of those that are tweets and similar? Almost all of them (the
| ones I look at being interested in the topic anyway).
|
| By 'know' I mean recognise the name as some sort of
| authority. I don't 'know' Jon Gruber or Sam Altman or Matt
| Levine, but I'll recognise them and understand why we're
| discussing their tweet.
|
| The linked tweet (whatever it's called) didn't say anything
| more than the title did here, so it was pointless to click
| through really. In replies someone asked the source and
| someone else replied with the link I commented above. (I
| don't 'know' those people either, but I recognise Ars/even if
| I didn't appreciate the longer form with more info.)
| genewitch wrote:
| thanks for engaging.
|
| > The linked tweet (whatever it's called)
|
| "post" works for social media regardless of the medium; not
| an admonishment, an observation. Also, by the time i saw
| this, it was already an Ars link, leaving some comments
| with less context that i apparently didn't pick up on. I
| was able to make my observation because someone mentioned
| mastodon (i think), but that was an assumption on my part
| that the original link was mastodon.
|
| So i asked the question to make sure it wasn't some bias
| against mastodon (or the fediverse), because I'd have liked
| to ask, "for what reason?"
| OJFord wrote:
| > > The linked tweet (whatever it's called)
|
| > "post" works for social media regardless of the medium;
| not an admonishment, an observation.
|
| It also works for professional journalism and blog-err-
| _posts_ though, the distinction from which was my point.
|
| > I was able to make my observation because someone
| mentioned mastodon (i think), but that was an assumption
| on my part that the original link was mastodon.
|
| As for assuming/'someone' mentioning Mastodon, my own
| comment you initially replied to ended:
|
| > (As in, an actual article, not just a mastodon-tweet
| from some unknown (maybe known? Not by me) person making
| the title claim, with no more info.)
|
| Which was even the bit ('unknown') you objected to.
| JKCalhoun wrote:
| > But now, OpenAI has been forced to preserve chat history even
| when users "elect to not retain particular conversations by
| manually deleting specific conversations or by starting a
| 'Temporary Chat,' which disappears once closed," OpenAI said.
|
| So, why is Safari not forced to save my web browsing history too
| (even of I delete it)? Why not also the "private" tabs I open?
|
| Just OpenAI, huh?
| wiradikusuma wrote:
| Because it's running on your computer?
| gpm wrote:
| First, there's no court order for Safari. This isn't the court
| saying "everyone always has to preserve data" it's a court
| saying "in the interest of this litigation this specific party
| has to preserve data for now".
|
| But moreover, Safari isn't a third party, it's a tool you are
| using whose data is in your possession. That means that in the
| US things like fourth amendment rights are _much_ stronger. A
| blanket order requiring that Safari preserve everyone 's
| browsing history would be an illegal general warrant (in the
| US).
| amanaplanacanal wrote:
| It's evidence in an ongoing lawsuit.
| yieldcrv wrote:
| > Before the order was in place mid-May, OpenAI only retained
| "chat history" for users of ChatGPT Free, Plus, and Pro who did
| not opt out of data retention
|
| > opt out
|
| alright, sympathy lost
| Imnimo wrote:
| So if you're a business that sends sensitive data through ChatGPT
| via the API and were relying on the representation that API
| inputs and outputs were not retained, OpenAI will just flip a
| switch to start retaining your data? Were notifications sent out,
| or did other companies just have to learn about this from the
| press?
| thuanao wrote:
| As if we needed another reason to hate NYT and their paywall...
| AlienRobot wrote:
| I'm usually against LLM's massive breach of copyright, but this
| argument is just weird.
|
| >At a conference in January, Wang raised a hypothetical in line
| with her thinking on the subsequent order. She asked OpenAI's
| legal team to consider a ChatGPT user who "found some way to get
| around the pay wall" and "was getting The New York Times content
| somehow as the output." If that user "then hears about this case
| and says, 'Oh, whoa, you know I'm going to ask them to delete all
| of my searches and not retain any of my searches going forward,'"
| the judge asked, wouldn't that be "directly the problem" that the
| order would address?
|
| If the user hears about this case, and now this order, wouldn't
| they just avoid doing that for the duration of the court order?
| junon wrote:
| Side note, why is almost every comment that contains the word
| "shill" so pompous and aggressive?
| johnnyanmac wrote:
| Shill in general has a strong connotation. It comes with the
| idea of someone who would use the word so freely that it'll
| naturally be aggressive.
|
| I don't know anyone's agenda in terms of commenters, so they'd
| have to be very blatant for me to use such a word.
| celnardur wrote:
| There has been a lot of opinion pieces popping up on HN recently
| that describe the benefits they see from LLMs and rebut the
| drawbacks most of them talk about. While they do bring up
| interesting points, NONE of them have even mentioned the privacy
| aspect.
|
| This is the main reason I can't use any LLM agents or post any
| portion of my code into a prompt window at work. We have NDAs and
| government regulations (like ITAR) we'd be breaking if any code
| left our servers.
|
| This just proves the point. Until these tools are local, privacy
| will be an Achilles heal for LLMs.
| garyfirestorm wrote:
| You can always self host an LLM which is completely controlled
| on your own server. This is trivial to do.
| celnardur wrote:
| Yes, but which of the state of the art models that offer the
| best results, are you allowed to do this with? As far as I've
| seen the models that you can host locally are not the ones
| being praised left and right in these articles. My company
| actually allows people to use a hosted version of Microsoft
| copilot, but most people don't because it's still not that
| much of a productivity boost (if any).
| genewitch wrote:
| Deepseek isn't good enough? You need a beefy GPU cluster
| but I bet it would be fine until the large llama is better
| at coding, and I'm certain there will be other large models
| for LLM. Now if there's some new technology around the
| corner, someone might be able to build a moat, but in a
| surprising twist, Facebook did us all a favor by releasing
| their weights back when; there's no moat possible, in my
| estimation, with LLMs as it stands today. Not even "multi-
| model" implementations. Which I have at home, too.
|
| Say oai implements something that makes their service 2x
| better. Just using it for a while should give people who
| live and breathe this stuff enough information to tease out
| how to implement something like it, and eventually it'll
| make it into the local-only applications, and models.
| anonymousDan wrote:
| Are there any resources on how much it costs to run the
| full deep seek? And how to do it?
| genewitch wrote:
| I can fill in anything missing, i would like to go to bed
| but i did't want to leave anyone hanging. had to come
| edit a comment i made from my phone, and my phone also
| doesn't show me replies (i use materialistic, is there a
| better app?)
|
| https://getdeploying.com/guides/run-deepseek-r1 this is
| the "how to do it"
|
| https://news.ycombinator.com/item?id=42897205 posted
| here, a link to how to set it up on an AMD Epyc machine,
| ~$2000. IIRC a few of the comments discuss how many GPUs
| you'd need (a lot of the 80GB GPUs, 12-16 i think), plus
| the mainboards and PSUs and things. however to _just run_
| the largest deepseek you merely need memory to hold the
| model and the context, plus ~10% and i forget why +10%
| but that 's my hedge to be more accurate.
|
| note: i have not checked if LM Studio can run the large
| deepseek model; i can't fathom a reason it couldn't, at
| least on the Epyc CPU only build.
|
| note too: I just asked in their discord and it appears
| "any GGUF model will load if you have the memory for it"
| - "GGUF" is like the format the model is in. Someone will
| take whatever format mistral or facebook or whoever
| publishes and convert it to GGUF format, and from there,
| someone will start to quantize the models into smaller
| files (with less ability) as GGUF.
| bogtog wrote:
| That's $2000 but for just 3.5-4.25 tokens/s? I'm hesitant
| to say that 4 tokens/s is useless, but that is a
| tremendous downgrade (although perhaps some smaller model
| would be usable)
| redundantly wrote:
| Trivial after a substantial hardware investment and
| installation, configuration, testing, benchmarking, tweaking,
| hardening, benchmarking again, new models come out so more
| tweaking and benchmarking and tweaking again, all while
| slamming your head against the wall dealing with the mediocre
| documentation surrounding all hardware and software
| components you're trying to deploy.
|
| Yup. Trivial.
| blastro wrote:
| This hasn't been my experience. Pretty easy with AWS
| Bedrock
| paxys wrote:
| Ah yes, "self host" by using a fully Amazon-managed
| service on Amazon's servers. How would a US court ever
| access those logs?
| garyfirestorm wrote:
| Run a vllm docker container. Yeah the assumption is you
| already know what hardware you need or you already have
| it on prem. Assuming this is ITAR stuff, you must be self
| hosting everything.
| dvt wrote:
| Even my 4-year-old M1 Pro can run a quantized Deepseek R1
| pretty well. Sure, full-scale productizing these models is
| hard work (and the average "just-make-shovels" startups are
| failing hard at this), but we'll 100% get there in the next
| 1-2 years.
| whatevaa wrote:
| Those small models suck. You need the big guns to get
| those "amazing" coding agents.
| bravesoul2 wrote:
| Local for emotional therapy. Big guns to generate code.
| Local to edit generated code once it is degooped and
| worth something.
| benoau wrote:
| I put it LM Studio on an old gaming rig with a 3060 TI,
| took about 10 minutes to start using it and most of that
| time was downloading a model.
| jjmarr wrote:
| If you're dealing with ITAR compliance you should have
| experience with hosting things on-premises.
| genewitch wrote:
| I'm for hire, I'll do all that for any company that needs
| it. Email in profile. Contract or employee, makes no
| difference to me.
| dlivingston wrote:
| Yes. The past two companies I've been at have self-hosted
| enterprise LLMs running on their own servers and connected
| to internal documentation. There is also Azure Cloud for
| Gov and other similar privacy-first ways of doing this.
|
| But also, running LLMs locally is easy. I don't know what
| goes into _hosting_ them, as a service for your org, but
| just getting an LLM running locally is a straightforward
| 30-minute task.
| aydyn wrote:
| It is not at all trivial for an organization that may be
| doing everything on the cloud to locally set up the necessary
| hardware and ensure proper networking and security to that
| LLM running on said hardware.
| woodrowbarlow wrote:
| > NONE of them have even mentioned the privacy aspect
|
| because the privacy aspect has nothing to do with LLMs and
| everything to do with relying on cloud providers. HN users have
| been vocal about that since long before LLMs existed.
| ljm wrote:
| Where is the source? OP goes to a mastodon instance that can't
| handle the traffic.
| ETH_start wrote:
| Two things. First, the judge could have issued a narrowly
| tailored order -- say, requiring OpenAI to preserve only those
| chats that a filter flags as containing substantial amounts of
| paywalled content from the plaintiffs. That would've targeted the
| alleged harm without jeopardizing the safety of massive amounts
| of unrelated user data.
|
| Second, we're going to need technology that can simply defy
| government orders, as digital technology expands the ability of
| one government order violating rights at scale. Otherwise, one
| judge -- whether in the U.S., China, or India -- can impose a
| sweeping decision that undermines the privacy and autonomy of
| billions.
| DevX101 wrote:
| There were some enterprises that refused to send any data to
| OpenAI, despite assurances that that data would not be logged.
| Looks like they've been vindicated in keeping everything on prem
| via self-hosted LLM models.
| lxgr wrote:
| > OpenAI is NOW DIRECTED to preserve and segregate all output log
| data that would otherwise be deleted on a going forward basis
| until further order of the Court (in essence, the output log data
| that OpenAI has been destroying), whether such data might be
| deleted at a user's request or because of "numerous privacy laws
| and regulations" that might require OpenAI to do so.
|
| Spicy. European courts and governments will love to see their
| laws and legal opinions being shrugged away in ironic quotes.
| reassess_blind wrote:
| Will the EU respond by blocking them from doing business in the
| EU, given they're not abiding by GDPR?
| echelon wrote:
| Hopefully.
|
| We need many strong AI players. This would be a great way to
| ensure Europe can grow its own.
| ronsor wrote:
| > This would be a great way to ensure Europe can grow its
| own.
|
| The reason this doesn't happen is because of Europe's
| internal issues, not because of foreign competition.
| lxgr wrote:
| Arguably, until recently there just wasn't any reason to:
| Europeans were happy buying American software and online
| services; Americans were happy buying German cars and
| pharmaceuticals.
| glookler wrote:
| Personally, I don't think the US clouds win anything on
| merit.
|
| It's hard/pointless to motivate engineers to use other
| options and their significance doesn't grow since
| Engineers won't blog that much about them to show their
| expertise, etc. Certification and experience with a
| provider with 10%-80% market share is a future employment
| reason to put up with a lot of trash, and the amount of
| help to work around that trash that has made it into
| places like ChatGPT is mindboggling.
| a2128 wrote:
| It would be a political catastrophe right now if the EU
| blocked US companies due to them needing to comply with
| temporary US court orders. My guess is this'll be swept under
| the rug and permitted under the basis of a legal obligation
| selcuka wrote:
| What about the other way around? Why don't we see a US
| court order that is in conflict with EU privacy laws as a
| political catastrophe, too?
| philipov wrote:
| Because courts are (wrongly) viewed as not being
| political, and public opinion hasn't caught up with
| reality yet.
| ethbr1 wrote:
| The court system as a whole is more beholden to laws as
| written than politics.
|
| And that's a key institution in a democracy, given the
| frequency with which either the executive or legislative
| branches try to do illegal things (defined by
| constitutions and/or previously passed laws).
| philipov wrote:
| Yes, courts ought to be apolitical. Just that recently,
| especially the supreme court has not been meeting that
| expectation.
| StanislavPetrov wrote:
| Courts have always been political, which is why
| "jurisdiction shopping" has been a thing for decades. The
| Supreme Court, especially, has always been political,
| which is why one of the biggest issues in political
| campaigns is who is going to be able to nominate new
| justices. Most people of all political persuasions view
| courts as apolitical when those courts issue rulings in
| that affirm their beliefs, and political when they rule
| against them.
|
| You're right though, in a perfect world courts would be
| apolitical.
| intended wrote:
| The American Supreme Court could have been balanced
| though. Sadly, one team plays to win, the other team
| wants to be in a democracy. The issue is not the politics
| of the court, but the enforced Partisanship which took
| hold of the Republican Party post watergate.
|
| All systems can be bent, broken, or subverted. Still, we
| need to make systems which do the best within the bounds
| of reality.
| StanislavPetrov wrote:
| >Sadly, one team plays to win, the other team wants to be
| in a democracy.
|
| As a lifelong independent, I can tell you that this sort
| of thinking is incredibly prevalent and also incredibly
| wrong. Even a casual look at recent history proves this.
| How do you define "democracy"? Most of us define it as
| "the will of the people". Just recently, however, when
| "the will of the people" has not been the will of the
| ruling class, the "will of the people" has been decried
| as dangerous populism (nothing new but something that has
| re-emerged recently in the so-called Western World). It
| is our "institutions" they argue, that are actually
| democracy, and not the will of the foolish people who are
| ignorant and easily swayed.
|
| >All systems can be bent, broken, or subverted.
|
| Very true, and the history of our nation is proof of
| that, from the founding right up to the present day.
|
| >Still, we need to make systems which do the best within
| the bounds of reality.
|
| It would be nice, but that is a long way from how things
| are, or have ever been (so far).
| collingreen wrote:
| My impression was that American democracy is supposed to
| "derive its power from those being governed" (as opposed
| to being given power by God) and pretty explicitly was
| designed to actively prevent "the tyranny of the
| majority", not enable it.
|
| I think it's a misreading to say the government should do
| whatever the whim of the most vocal, gerrymandered
| jurisdictions are. Instead, it is a supposed to be a
| republic with educated, ethical professionals doing the
| lawmaking within a very rigid structure designed to limit
| power severely in order to protect individual liberty.
|
| For me, the amount of outright lying, propaganda, blatant
| corruption, and voter abuse makes a claim like "democracy
| is the will of the most people who agree" seem misguided
| at best (and maybe actively deceitful).
|
| Re reading your comment, the straw man about "democracy
| is actually the institutions" makes me think I may have
| fallen for a troll so I'm just going to stop here.
| StanislavPetrov wrote:
| >Re reading your comment, the straw man about "democracy
| is actually the institutions" makes me think I may have
| fallen for a troll so I'm just going to stop here.
|
| You haven't, so be assured.
|
| >I think it's a misreading to say the government should
| do whatever the whim of the most vocal, gerrymandered
| jurisdictions are.
|
| It shouldn't, and I didn't argue that. My argument is
| that the people in charge have completely disregarded the
| will of the people en mass for a long time, and that the
| people are so outraged and desperate that at this point
| they are willing to vote for anyone who will upend the
| elite consensus that refuses to change.
|
| >Instead, it is a supposed to be a republic with
| educated, ethical professionals doing the lawmaking
| within a very rigid structure designed to limit power
| severely in order to protect individual liberty.
|
| How is that working out for us? Snowden's revelations
| were in 2013. An infinite number of blatantly illegal and
| unconstitutional programs actively being carried out by
| various government agencies. Who was held to account?
| Nobody. What was changed? Nothing. Who was in power? The
| supposedly "good" team that respects democracy. Go watch
| the conformation hearing of Tulsi Gabbard from this year.
| Watch Democratic Senator after Democratic Senator
| denounce Snowden as a traitor and repeatedly demand that
| she denounce him as well, as a litmus test for whether or
| not she could be confirmed as DNI (this is not a comment
| on Gabbard one way or another). My original comment
| disputed the contention that one party was for democracy
| and the other party was against it. Go watch that video
| and tell me that the Democrats support liberty, freedom,
| democracy and a transparent government. I don't support
| either of the parties, and this is one of the many
| reasons why.
| vanviegen wrote:
| > You're right though, in a perfect world courts would be
| apolitical.
|
| Most other western democracies are a lot closer to a
| perfect world, it seems.
| StanislavPetrov wrote:
| Germany, where they lock you up for criticizing
| politicians[1] or where they have a ban against
| protesting for Palestine because it's "antisemitic"?[2]
|
| Or UK where you can get locked up for blasphemy[3] or
| where they lock up ~30 people a day for saying offensive
| things online because of their Online Safety Act?[4]
|
| Or perhaps Romania where an election that didn't turn out
| the way the EU elites wanted is overturned based on
| nebulous (and later proven false) accusation that the
| election was somehow influenced by a TikTok campaign by
| the Russians that later turned out to have been funded by
| a Romanian opposition party.[5]
|
| I could go on and on, but unfortunately most other
| western democracies are just as flawed, if not worse.
| Hopefully we can all strive for a better future and flush
| the authoritarians, from all the parties.
|
| [1] https://www.youtube.com/watch?v=-bMzFDpfDwc
|
| [2] https://www.euronews.com/2023/10/19/mass-arrests-
| following-p...
|
| [3] https://news.sky.com/story/man-convicted-after-
| burning-koran...
|
| [4] https://www.thetimes.com/uk/crime/article/police-
| make-30-arr...
|
| [5] https://www.politico.eu/article/investigation-ties-
| romanian-...
| vanviegen wrote:
| I understand these are court decisions you don't agree
| with. (And neither do I for the most part, though I
| imagine some of these cases to have more depth to them.)
|
| But is there any reason to believe that judged were
| pressured/compelled by political powers to make these
| decisions? Apart from, of course, the law created by
| these politicians, which is how the system is intended to
| work.
| StanislavPetrov wrote:
| >But is there any reason to believe that judged were
| pressured/compelled by political powers to make these
| decisions?
|
| No, but I have every reason to believe that the judges
| who made these decisions were people selected by
| political powers so that they would make them.
|
| >Apart from, of course, the law created by these
| politicians, which is how the system is intended to work.
|
| But the system isn't working for the people, it is
| horribly broken. The people running the system are mostly
| corrupt and/or incompetent, which is why so many voters
| from a wide variety of countries, and across the
| political spectrum, are willing to vote for anyone (even
| people who are clearly less than ideal) that shits all
| over the system and promises to smash it. Because the
| system is currently working exactly how it's intended to
| work, most people hate it and nobody feels like they can
| do anything about it.
| bee_rider wrote:
| Even if the we imagined the courts as apolitical (and I
| agree with you, they actually _are_ political so
| imagining otherwise is silly), the question of how to
| react to court cases in _other countries_ is a matter of
| geopolitics and international relations.
|
| While folks believe all sorts of things, I don't think
| anyone is going to call international relations
| apolitical!
| Nasrudith wrote:
| International relations could fairly be called anarchic
| because they aren't bound by law and no entity is capable
| of enforcing them against nation states. Remember that
| whenever 'sovereignty' is held up as some sacred, shining
| ideal what they really mean is 'the ability to do
| whatever the hell they want without being held
| accountable'.
| MattGaiser wrote:
| EU has few customer facing tech companies of note.
| bee_rider wrote:
| We're doing our best to provide them an opening, though.
| lmm wrote:
| Because deep down Americans don't actually have any
| respect for other countries. This sounds like a flamebait
| answer, but it's the only model I can reconcile with
| experience.
| DocTomoe wrote:
| Without trying to become too political, but thanks to
| recent trade developments, right now the US is under
| special scrutiny to begin with, and goodwill towards US
| companies - or courts - has virtually evaporated.
|
| I can see that factoring in in a decision to penalise an US
| company when it breaks EU law, US court order or not.
| yatopifo wrote:
| Ultimately, American companies will be pushed out of the EU
| market. It's not going to happen overnight, but the outcome
| is unavoidable in light of the ongoing system collapse in
| the US.
| rafaelmn wrote:
| EU software scene would take a decade to catch up. Only
| alternative being if AI really delivers on being a force
| multiplier - but even then EU would not have access to
| SOTA internally.
| blagund wrote:
| What does the EU lack? Is it the big corp infra? Or
| something more fundamental?
| KoolKat23 wrote:
| Big corpo cash and big risk appetite.
| inglor_cz wrote:
| In my opinion, we lack two things:
|
| a) highly qualified people, even European natives move to
| Silicon Valley. There is a famous photo of the OpenAI
| core team with 6 Polish engineers and only 5 American
| ones;
|
| b) culture of calculated risk when it comes to
| investment. Here, bankruptcy is an albatross around your
| neck, both legally and culturally, and is considered a
| sign of you being fundamentally inept instead of maybe
| just a misalignment with the market or even bad luck.
| You'd better succeed on your first try, or your options
| for funding will evaporate.
| dukeyukey wrote:
| Worth pointing out that DeepMind was founded in London,
| the HQ is still here, and so is the founder and CEO. I've
| lived in North London for 8 years now, there are _loads_
| of current-and-former DeepMind AI people here. Now that
| OpenAI, Anthropic, and Mistral have offices here the
| talent density is just going up.
|
| On risk, we're hardly the Valley, but a failed startup
| isn't a black mark at all. It's a big plus in most tech
| circles.
| inglor_cz wrote:
| The UK is a bit different in this regard, same as
| Scandinavia.
|
| But in many continental countries, bankruptcy is a
| serious legal stigma. You will end up on public
| "insolvency lists" for years, which means that no bank
| will touch you with a 5 m pole and few people will even
| be willing to rent you or your new startup office space.
| You may even struggle to get banal contracts such as
| "five SIMs with data" from mobile phone operators.
|
| There seems to be an underlying assumption that people
| who go bankrupt are either fatally inept or fraudsters,
| and need to be kept apart from the "healthy" economy in
| order not to endanger it.
| ben_w wrote:
| Given what happened with DeepSeek, "not state of the art"
| can still be simultaneously really close to the top, very
| sudden, very cheap, and from one small private firm.
| rafaelmn wrote:
| Not really with the EU data sources disclosure mindset,
| GDPR and all that. China has a leg up in the data game
| because they care about copyright/privacy and IP even
| less than US companies. EU is supposedly booting US
| companies because of this.
| ben_w wrote:
| The data sources is kinda what this court case is about,
| and even here on HN a lot of people get very annoyed by
| the application of the "open source" label to model
| weights that _don 't_ have the source disclosure the EU
| calls for.
|
| GDPR is about personally identifiable humans. I'm not
| sure how critical that information really is to these
| models, though given the difficulty of deleting it from a
| trained model when found, yes I agree it poses a huge
| practical problem.
| romanovcode wrote:
| Why? I would see it 2 years ago but now every other
| platform has completely catched-up to ChatGPT. The LeChat
| or whatever the French alternative is just as good.
| ensignavenger wrote:
| Doesn't GDPR have an explicit exemption for legal compliance?
| lxgr wrote:
| Yes, but somehow I feel like "a foreign court told us to
| save absolutely everything" will not hold up in the EU
| indefinitely.
|
| At least in sensitive contexts (healthcare etc.) I could
| imagine this resulting in further restrictions, assuming
| the order is upheld even for European user's data.
| dijksterhuis wrote:
| GDPR allows for this as far as i can tell (IANAL)
|
| > Paragraphs 1 and 2 shall not apply to the extent that
| processing is necessary:
|
| > ...
|
| > for the establishment, exercise or defence of legal claims.
|
| https://gdpr-info.eu/art-17-gdpr/
| killerpopiller wrote:
| if you are the controller (or data subject). ChatGPT is the
| Processor. Otherwise EU controller processing PII in
| ChatGPT have a problem now.
| ndsipa_pomu wrote:
| 'Controller' means the natural or legal person, public
| authority, agency or other body which, alone or jointly
| with others, determines the purposes and means of the
| processing of personal data.
|
| 'Processor' means a natural or legal person, public
| authority, agency or other body which processes personal
| data on behalf of the controller.
| _Algernon_ wrote:
| Do legal claims outside eu jurisdiction apply? Seems like
| too big of a loophole to let any court globally sidestep
| GDPR.
| gaogao wrote:
| The order is a bit broad, but legal holds frequently interact
| with deletion commitments. In particular, the only purpose of
| data deleted under GDPR held up by a legal hold should be for
| that legal hold, so it would be a big no-no if OpenAI
| continued to use that data for training.
| voxic11 wrote:
| > The General Data Protection Regulation (GDPR) gives
| individuals the right to ask for their data to be deleted and
| organisations do have an obligation to do so, except in the
| following cases:
|
| > there is a legal obligation to keep that data;
|
| https://commission.europa.eu/law/law-topic/data-
| protection/r...
| tgsovlerkhgsel wrote:
| Except that transferring the data to the US is likely
| illegal in the first place, specifically due to
| insufficient legal protections against overreach like this.
| hulitu wrote:
| No. GDPR was never enforced, else Microsoft, Meta, Google and
| Apple couldn't do business in the EU.
| udev4096 wrote:
| EU privacy laws are nothing but theatre. How many times have
| they put up a law which would undermine end-to-end encryption
| or the recent law which "bans" anonymous crypto coins? It's
| quite clear they are very good at virtue signaling
| thrwheyho wrote:
| Not sure why are you down voted here. I'm in EU and
| everything I can say about GDPR is that it does not work
| (simple example, government itself publishes my data
| including National ID number on their portal with property
| data; anybody can check who my parents are or if I'm on
| mortgage). And there are more.
| shadowgovt wrote:
| Europe doesn't have jurisdiction over the US, and US courts
| have broad leeway in a situation like this.
| lxgr wrote:
| Sure, but the EU has jurisdiction over European companies and
| can prohibit them from storing or processing their data in
| jurisdictions incompatible with its data privacy laws.
|
| There's also an obvious compromise here - modify the US court
| ruling to exclude data of non-US users. Let's hope that cool
| heads prevail.
| lrvick wrote:
| And yet, iPhones are shipping usb C.
|
| Making separate manufacturing lines for Europe vs US is too
| expensive, so in effect, Europe forced a US company to be
| less shitty globally.
| VladVladikoff wrote:
| USB C is a huge downgrade from lightning. The lightning
| connector is far more robust. The little inner connector
| board on USB C is so fragile. I'll never understand why
| people wanted this so badly.
| Cloudef wrote:
| Literally every other device is usb-c, thats why
| ajr0 wrote:
| I'm a big fan of this change, however, I think to a black
| mirror episode [0] where essentially little robot dogs
| could interface with everything they came into contact
| with because every connection was the same, it may be
| trivial to have multiple connections for a weapon like
| this but the take away I had from this is that 'variety'
| may be better than a single standardized solution. Partly
| because it is more expensive to plan for multiple types
| of inputs and making the cost of war go up will make it
| more difficult which I think inherently is the idea
| behind some of the larger cybersecurity companies, a hack
| can only work once then everyone has defenses for it
| after that single successful attack, this makes it more
| expensive to successfully stage attacks. Huge digression
| from this convo... but I think back to this constantly.
|
| [0]
| https://en.wikipedia.org/wiki/Metalhead_(Black_Mirror)
| lxgr wrote:
| Proprietary ports are textbook security by obscurity.
| intended wrote:
| Many defensive are a trade off between the convenience of
| non attackers, and the trouble created for attackers.
|
| Given the sheer number of devices we interact with in a
| single day, USB-C as a standard is worth the trade off
| for an increase in our threat surface area.
|
| 1000 Attackers can carry around N extra charging wires
| anyway.
|
| 10^7 users having to keep say, 3 extra charging wires on
| average? That's a huge increase in costs and resources.
|
| (Numbers made up)
| bee_rider wrote:
| Two thoughts:
|
| 1) Surely the world conquering robo-army could get some
| adapters.
|
| 2) To the extend to which this makes anything more
| difficult, it is just that it makes everything a tiny bit
| less convenient. This includes the world-conquering robo-
| army, but also everything else we do. It is a general
| argument against capacity, which can't be right, right?
| lou1306 wrote:
| So I suppose Lightning's abysmally slow transfer speed is
| also a security feature? No way you can exfiltrate my
| photo roll at 60 MBps :)
| itake wrote:
| All of my devices are lightening. Now I have to carry
| around 2 cables.
| 0x073 wrote:
| What should I say with an iPod classic? 3!
| itake wrote:
| why do I need to replace my devices every year?
| broodbucket wrote:
| There's simply more people with the opposite problem,
| especially in markets where Apple is less prevalent,
| which is most of them around the world. When there's more
| than one type of cable, plenty of people are going to be
| inconvenienced when one is chosen as the cable to rule
| them all, but in the end everyone wins, it's just
| annoying to get there.
| jjk166 wrote:
| > plenty of people are going to be inconvenienced when
| one is chosen as the cable to rule them all, but in the
| end everyone wins
|
| That's not everyone wins. The people that actually bought
| these devices now have cables that don't work and need to
| replace with a lower quality product, and the people who
| were already using something else are continuing to not
| need cables for these devices. The majority breaks even,
| a significant minority loses.
|
| Simply not choosing one cable to rule them all lets
| everyone win. There is no compelling reason for one size
| to fit all.
| FeepingCreature wrote:
| It's a temporary drawback; everyone wins in the long term
| because there's only one standard.
| jjk166 wrote:
| Again, that's not a win for anybody. No one winds up in a
| better position than where they started, there is no
| payback in exchange for the temporary drawback, which
| also isn't temporary if the final standard is inferior.
|
| If some people like hip hop but more people like country,
| it's not a win for everybody to eliminate the hip hop
| radio stations so we can all listen to a single country
| station.
| inglor_cz wrote:
| This is closer to having a common railway gauge, though.
| itake wrote:
| Everyone, but the environment wins. Once I upgrade my
| phone and Airpods, I will have to throw out my pack of
| perfectly working lightning cables.
|
| I'm sure there are more than a few people that would end
| up throwing out their perfectly functional accessories,
| only for the convenience of carrying less cables.
| AStonesThrow wrote:
| Why don't you donate them to a thrift store or
| educational charity? Are there no non-profits who
| refurbish and reuse electronics in your community?
| itake wrote:
| I don't want to burn fuel trying to find a place to
| accept used 5 year old Airpod Pros with yellow earbud
| plastic.
|
| I don't want to ship another cable across the Pacific
| Ocean from China so I can have a cable that works on my
| devices.
|
| I want to keep using them until they don't work and I
| can't repair them any more.
| Larrikin wrote:
| All of mine are USB C and now I only carry around one.
| All of the lightning cords and micro USB cables are in a
| drawer somewhere with the DVI, component cables, etc.
| itake wrote:
| neat. I get to throw out my perfectly working apple
| products that have years left in them and switch re-sync
| my cables.
|
| That is great you spent the money for this, but I'm not
| ready to throw away my perfectly fine devices.
| ChrisMarshallNY wrote:
| Reminds me of something...[0]
|
| [0] https://foreverstudios.com/betamax-vs-vhs/
| lxgr wrote:
| Have you ever had one fail on an Apple device?
|
| The accurate comparison here isn't between random low-
| budget USB-C implementations and Lightning on iPhones,
| but between USB-C and Lightning both on iPhones, and as
| far as I can tell, it's holding up nicely.
| xlii wrote:
| I have multiple ones. They accumulate dust easily and get
| damaged much more often. You won't see that if 99% of the
| time is in clean environment. As of today I have 3
| iPhones that won't charge on the wire. Those are physical
| damages so it's not like anything covers it (and Apple
| Care is not available in Poland). Same happens with the
| cables. I'm replacing USB-C display cable (that's
| lightning I suppose) every year now because they get
| loose and start to disconnect if you sneeze in the
| vicinity.
|
| I despise USB-C with all my heart. Amount of cable trash
| has tripled over the years.
| lostlogin wrote:
| Maybe try wireless charging.
|
| I find it superior to both lightning and USB-C.
| xlii wrote:
| I do, I rarely use USB-C on Apple devices (well outside
| of Mac). Wireless is great and I have nice looking night
| clock and moving files/other stuff over airdrop works
| good enough for me not plugging anything in. Recently I
| had to charge over the phone which required removal of
| pocket lint beforehand.
| khimaros wrote:
| with USB-C, the fragile end is on the cable instead of
| the port. that is a design feature.
| com2kid wrote:
| I thought so, but my Pixel 9 USB-C port falls out of
| place now after less than a year. :(
| shadowgovt wrote:
| Same problem. This may be the last Pixel I own because
| two in a row now have lost their USB-C sockets.
| linotype wrote:
| What are you doing to your USB-C devices? I've owned
| dozens and never had a single port break.
| Our_Benefactors wrote:
| It's a huge upgrade on the basis of allowing me to remove
| lightning cables from my life
|
| On a specsheet basis it also charges faster and has a
| higher data transmission rate.
|
| Lightning cables are not more robust. They are known to
| commonly short across the power pins, often turning the
| cable into an only-works-on-one-side defect. I replaced
| at least one cable every year due to this.
| shadowgovt wrote:
| Lightning was basically what happened when Apple got
| tired of waiting for the standards committee to converge
| on what USB-C would be, so they did their own.
|
| And... yeah, it turned out better than the standard.
| Their engineers have really good taste.
| WA wrote:
| And then the rest of the world got tired of Apple not
| proposing this supposedly superior piece of engineering
| as a new standard... because of licensing shenanigans.
| bowsamic wrote:
| Because I can charge my iPhone, my AirPods, and my mac,
| all with the same charger
| the_duke wrote:
| The EU has a certain amount of jurisdiction over all
| companies providing a service to customers located in the EU.
|
| A US company can always stop serving EU customers if it
| doesn't want to comply with EU laws, but for most the market
| is too big to ignore.
| shadowgovt wrote:
| So when one government compels an action another forbids,
| that's an international political situation.
|
| There is no supreme law at that level; the two nations have
| to hash it out between them.
| KingOfCoders wrote:
| EU-US Data Privacy Framework is an US scam to get European user
| data.
| wkat4242 wrote:
| It totally is that's why it keeps being shot down by the
| courts and relaunched under a different name.
|
| But the EU willingly participates in this. Probably because
| they know there's no viable alternative for the big clouds.
|
| This is coming now though since the US instantly.
| Y_Y wrote:
| I know I viable alternative to the big clouds.
|
| Don't use them! They cost too much in dollar terms, they
| all try to EEE lock you in with "managed" (and subtly
| incompatible) versions of services you would otherwise run
| yourself. They are too big to give a shit about laws or
| customer whims.
|
| I have plenty of experience with the big three clouds and
| given the choice I'll run locally, or on e.g. Hetzner, or
| not at all.
|
| My company loves to penny-pinch things like lunch
| reimbursement and necessary tools, but we piss away latge
| sums on unused or under-used cloud capacity with glee, be
| ause that is magically billed elsewhere (from the point of
| view of a given manager).
|
| It's a racket, and I'm by mo means the first to say so. The
| fact that this money-racket is also a dat-racket doesn't
| surprise me in the least. It's just good racketeering!
| wkat4242 wrote:
| Yes. Cloud is great for a very specific usecase. Venture
| capital startups which either go viral and explode, or
| die within a year. In those cases you need the capacity
| to automatically scale and also to only pay for the
| resources you actually use so you the service pays for
| itself. You also have no capex and you can drop the costs
| instantly if you need to close up shop. For services that
| really need infinite and instant scaling and flexibility,
| cloud is a genuinely great option.
|
| However that's not what most traditional companies do.
| What my employer does, is picking up the physical servers
| they had in our datacenters, dump them on an AWS compute
| box they run 24/7 without any kind of orchestration and
| call it "cloud". That's not what cloud is, that is really
| just someone else's computer. We spend a LOT more now but
| our CIO wanted to "go cloud" because everyone is so it
| was more a tickbox than a real improvement.
|
| Microservices, object storage etc, that is cloud.
| KingOfCoders wrote:
| "But the EU willingly participates in this."
|
| Parts of the European Commission "influenced" by lobbyists
| collude with the US.
| wkat4242 wrote:
| Sorry "instantly" should have been "is in chaos".
| Autocorrect...
| Aeolos wrote:
| And just like that, OpenAI got banned in my company today.
|
| Good job.
| PeterStuer wrote:
| Don't tell them all their other communication is intercepted
| and retained on the same basis. Good luck running your
| business in full isolation.
| DaiPlusPlus wrote:
| > OpenAI got banned in my company today
|
| Sarbanes-Oxley would like a word.
| Y_Y wrote:
| Do you mean to say that Sarbox might preclude this? Or that
| it should have been banned already? The meaning isn't clear
| to me and I would be grateful for further explanation.
| Ekaros wrote:
| Time to hit them with that 4% fine on revenue while they still
| have some money...
| Frieren wrote:
| > Spicy. European courts and governments will love to see their
| laws and legal opinions being shrugged away in ironic quotes.
|
| The GDPR allows to retain data when require by law as long as
| needed. People that make regulations may make mistakes
| sometimes, but they are no that stupid as to not understand the
| law and what things it may require.
|
| The data was correctly deleted on user demand. But it cannot be
| deleted where there is a Court order in place. The conclusion
| of "GDPR is in conflict with the law" looks like rage baiting.
| _Algernon_ wrote:
| It's questionable to me whether a court order of a non-eu
| court applies. "The law" is EU law, not American law.
|
| If any non-eu country can circumvent GDPR by just making a
| law that it doesn't apply, the entire point of the regulation
| vanishes.
| kbelder wrote:
| Doesn't that work both ways? Why should the EU be able to
| override American laws regarding an American company?
| FeepingCreature wrote:
| Because it's European users whose data is being recorded
| on the order of a court that doesn't even have
| jurisdiction over them?
| midasz wrote:
| It doesn't really matter from what country the company
| is. If you do business in the EU then EU laws apply to
| the business you do in the EU. Just like EU companies
| adhere to US law for the business they do in the US.
| _Algernon_ wrote:
| Because EU has jurisdiction when the american company
| operates in the EU.
| thephyber wrote:
| It's WAY more complicated than that.
|
| Where is the HQ of the company?
|
| Where does the company operate?
|
| What country is the individual user in?
|
| What country do the servers and data reside in?
|
| Ditto for service vendors who also deal with user data.
|
| Even within the EU, this is a mess and companies would
| rather use a simple heuristic like put all servers and
| store all data for EU users in the most restrictive
| country (I've heard Germany).
| _Algernon_ wrote:
| Maybe when talking about the GDPR specifics, but not when
| it comes to whether the EU has jurisdiction over
| companies in the EU.
| throw_a_grenade wrote:
| > Where is the HQ of the company?
|
| If outside EU, then they need to accept EU jurisdiction
| and notify who is representative plenipotentiary (== can
| make decisions and take liability on behalf of the
| company).
|
| > Where does the company operate?
|
| Geography mostly doesn't matter as long as they interact
| with EU people. Because people are more important.
|
| > What country is the individual user in?
|
| Any EU (or EEA) country.
|
| > What country do the servers and data reside in?
|
| Again, doesn't matter, because people > servers.
|
| It's almost like if bureaucrats who are writing
| regulations are experienced in writing regulations in
| such a way they can't be circumvented.
|
| EDIT TO ADD:
|
| From OpenAI privacy policy:
|
| > 1. Data controller
|
| > If you live in the European Economic Area (EEA) or
| Switzerland, OpenAI Ireland Limited, with its registered
| office at 1st Floor, The Liffey Trust Centre, 117-126
| Sheriff Street Upper, Dublin 1, D01 YC43, Ireland, is the
| controller and is responsible for the processing of your
| Personal Data as described in this Privacy Policy.
|
| > If you live in the UK, OpenAI OpCo, LLC, with its
| registered office at 1960 Bryant Street, San Francisco,
| California 94110, United States, is the controller and is
| responsible for the processing of your Personal Data as
| described in this Privacy Policy.
| Y_Y wrote:
| As you astutely note, the company probably has it's "HQ"
| (for some legal definition of HQ) a mere 30 minutes
| across Dublin (Luas, walk in rain, bus, more rain) from
| the Data Protection Commission. It's very likely that
| whatever big tech data-hoarder you choose has a presence
| very close to their opposite number in both of these
| cases.
|
| If it was easier or more cost-effective for these
| companies not to have a foot in the EU they wouldn't
| bother, but they do.
| chris12321 wrote:
| > It's almost like if bureaucrats who are writing
| regulations are experienced in writing regulations in
| such a way they can't be circumvented.
|
| Americans often seem to have the view that lawmakers are
| bumbling buffoons who just make up laws on the spot with
| no thought given to loop holes or consequences. That
| might be how they do it over there, but it's not really
| how it works here.
| Scarblac wrote:
| They can't override laws of course, but it could mean
| that if two jurisdictions have conflicting laws, you
| can't be active in both of them.
| mattlondon wrote:
| Likewise, why should America be able to override European
| laws regarding European users in Europe?
|
| It's all about jurisdiction. Do business in Country X?
| Then you need to follow Country X's laws.
|
| Same as if you go on vacation to County Y. If you do
| something that is illegal in Country Y while you are
| there, even if it's legal in your home country, you still
| broke the law in Country Y and will have to face the
| consequences.
| lmm wrote:
| Because we're talking about the personal data of EU
| citizens. If it's to be permitted to be sent to America
| at all, that must come with a guarantee that EU-standard
| protections will continue to apply regardless of American
| law.
| bonoboTP wrote:
| > If it's to be permitted to be sent to America at all
|
| Do you mean that I, an EU citizen am being granted some
| special privilege from EU leadership to send my data to
| the US?
| throw_a_grenade wrote:
| No, the company you're sending it to is required to care
| for it. Up to and including refusing to accept that data
| if need be.
| wkat4242 wrote:
| It's the other way around. The EU has granted US
| companies a temporary permission to handle EU customers'
| data. https://en.m.wikipedia.org/wiki/EU%E2%80%93US_Data_
| Privacy_F...
|
| I say temporary because it keeps being shot down in court
| for lax privacy protections and the EU keeps refloating
| it under a different name for economic reasons. Before
| this name it was called safe harbor and after that it was
| privacy shield.
| andrecarini wrote:
| It works the other way around; the American company is
| granted a special privilege to retrieve EU citizen data.
| bonoboTP wrote:
| I'm not sure they are "retrieving" data. People register
| on the website and upload stuff they want to be processed
| and used.
|
| I mean, sometimes the government steps in when you
| willingly try to hand over something on your own will,
| such as very strict rules around organ donation, I can't
| simply decide to give my organs to some random person for
| arbitrary reasons even if I really want to. But I'm not
| sure if data should be the same category where the
| government steps in and says "no you can't upload your
| personal data to an American website"
| lmm wrote:
| Of course you don't need permission to do something with
| your own data. But if someone wants to process _other
| people 's_ data, that's absolutely a special privilege
| that you don't get without committing to appropriate
| safety protocols.
| Garlef wrote:
| You don't understand how that works:
|
| EU companies are required to act in compliance with the
| GDPR. This includes all sensitive data that is transfered
| to business partners.
|
| They must make sure that all partners handle the
| (sensitive part of the) transfered data in a GDPR
| compliant way.
|
| So: No law is overriden. But in order to do business with
| EU companies, US companies "must" offer to treat the data
| accordingly.
|
| As a result, this means EU companies can not transfer
| sensitive data to US companies. (Since the president of
| the US has in principle the right to order any US company
| to turn over their data.)
|
| But in practice, usually no one cares. Unless someone
| does and then you might be in trouble.
| blitzar wrote:
| Taps the sign ... US companies operating in the EU are
| subject to EU laws.
| Frieren wrote:
| > GDPR: "Any judgment of a court or tribunal and any
| decision of an administrative authority of a third country
| requiring a controller or processor to transfer or disclose
| personal data may only be recognized or enforceable if
| based on an international agreement..."
|
| That is why international agreements and cooperation is so
| important.
|
| Agreement with the United States on mutual legal
| assistance: https://eur-lex.europa.eu/legal-
| content/EN/TXT/?uri=legissum...
|
| Regulatory entities are quite competent and make sure that
| most common situations are covered. When some new situation
| arises an update to the treaty will be created to solve it.
| _Algernon_ wrote:
| Seems like the EU should be less agreeable with these
| kinds of treaties going forward. Though precedent is
| already set by the US that international agreements don't
| matter so arguably the EU should just ignore this.
| friendzis wrote:
| > Regulatory entities are quite competent and make sure
| that most common situations are covered.
|
| There's "legitimate interest", which makes the whole GDPR
| null and void. Every website nowdays has the "legitimate
| interest" toggled on for "track user across services",
| "measure ad performance" and "build user profile". And
| it's 100% legal, even though the official reason for GDPR
| to exist in the first place is to make these practices
| illegal.
| troupo wrote:
| "legitimate interest" isn't a carte blanche. Most of
| those "legitimate interest" claims are themselves illegal
| octo888 wrote:
| Legitimate interest includes
|
| - Direct Marketing
|
| - Preventing Fraud
|
| - Ensuring information security
|
| It's weasel words all the way down. Having to take into
| account "reasonable" expectations of data subjects etc.
| Allowed where the subject is "in the service of the
| controller"
|
| Very broad terms open to a lot of lengthy debate
| troupo wrote:
| None of these allow you to just willy-nilly send/sell
| info to third parties. Or use that data for anything
| other than stated purposes.
|
| > Very broad terms open to a lot of lengthy debate
|
| Because otherwise no law would eve be written, because
| you would have to explicitly define every single possible
| human activity to allow or disallow.
| bryanrasmussen wrote:
| preventing fraud and info security are legitimate, direct
| marketing may be legitimate but probably is not.
|
| direct marketing that I believe is legitimate - offers
| with rebate on heightened service level if you currently
| have lower service level.
|
| direct marketing that is not legitimate, this guy has
| signed up for autistic service for our video service
| (silly example, don't know what this would be), therefore
| we will share his profile with various autistic service
| providers so they can market to him.
| friendzis wrote:
| > preventing fraud
|
| Fraud prevention is literally "collect enough cross-
| service info to identify a person in case we want to
| block them in the future". Weasel words for tracking.
|
| > therefore we will share his profile with various
| autistic service providers so they can market to him.
|
| This again falls under legitimate interest. The user,
| being profiled as x, may have legitimate interest in
| services targeting x. But we can't deliver this unless we
| are profiling users, so we cross-service profile users,
| all under the holy legitimate interest
| troupo wrote:
| > Fraud prevention is literally "collect enough cross-
| service info to identify a person in case we want to
| block them in the future". Weasel words for tracking.
|
| You're literally not allowed to store that data for
| years, or to sell/use that data for marketing and actual
| tracking purposes.
| octo888 wrote:
| And how funny - I just got an email from Meta about
| Instagram:
|
| "Legitimate interests is now our legal basis for using
| your information to improve Meta Products"
|
| Fun read https://www.facebook.com/privacy/policy?section_
| id=7-WhatIsO...
|
| But don't worry, "None of these allow you to just willy-
| nilly send/sell info to third parties." !
| octo888 wrote:
| Exactly. The ECJ flapped a bit in 2019 about this but
| then last year opined that the current interpretation
| "legitimate interest" by the Dutch DPA is too strict (on
| the topic of whether purely commercial interests counts)
|
| It's a farce and just like the US constitution they'll
| just continuously argue about the meanings of words and
| erode then over time
| danlitt wrote:
| "legitimate interest" is a fact about the data
| processing. It cannot be "toggled on". It also does not
| invalidate all other protections (like the prevention of
| data from leaving the EEA).
| friendzis wrote:
| https://www.reddit.com/media?url=https%3A%2F%2Fpreview.re
| dd....
| dotandgtfo wrote:
| None of those use cases are broadly thought of as
| legitimate interest and explicitly require some sort of
| consent in Europe.
|
| Session cookies and profiles on logged in users is where
| I see most companies stretching for legitimate interest.
| But cross service data sharing and persistent advertising
| cookies without consent are clearly no bueno.
| friendzis wrote:
| > But cross service data sharing and persistent
| advertising cookies without consent are clearly no bueno.
|
| https://www.reddit.com/media?url=https%3A%2F%2Fpreview.re
| dd....
| bryanrasmussen wrote:
| legitimate interest is, for example - have some way to
| identify user who is logged in. So keep email address for
| logged in users. Have some way to identify people who are
| trying to get account that have been banned, so have a
| table of banned users with email addresses for example.
|
| none of these others are legitimate interest. Furthermore
| combining the data from legitimate interest (email
| address to keep track of your logged in user) with
| illegitimate goals such as tracking across services would
| be illegitimate.
| pennaMan wrote:
| Basically, the GDPR doesn't guarantee your privacy at all.
| Instead, it hands it over to the state through its court
| system.
|
| Add to that the fact that the EU's heavy influence on the
| courts is a well-documented, ongoing deal, and the GDPR comes
| off as a surveillance law dressed up to seem the total
| opposite.
| Y_Y wrote:
| Quite right it doesnt _absolutely_ protect your privacy. I
| 'd agree that it's full of holes, but I do think it also
| contains effective provisions which assist with users
| controlling their data and data which identifies them.
|
| Which courts are influenced by the EU? I don't think it's
| true of US courts, and courts in EU nations are supposed to
| be influenced by it, it's in the EU treaties.
| albert_e wrote:
| Does this apply to OpenAI models served via Microsoft Azure.
| g42gregory wrote:
| Can somebody please post a complete list of these news
| organizations, demanding to see all of our ChatGPT conversations?
|
| I see one of them: The New York Times.
|
| We need to let people know who the other ones are.
| dijksterhuis wrote:
| https://originality.ai/blog/openai-chatgpt-lawsuit-list
| tgv wrote:
| Why?
| DaSHacka wrote:
| To know what subscriptions we need to cancel.
| tgv wrote:
| Yeah, shoot the messenger, that has always worked.
| g42gregory wrote:
| Usually, the messenger does not file lawsuits...
| crmd wrote:
| If you use ChatGPT or similar for any non-trivial purposes,
| future you is saying it's _essential_ that the chat logs do not
| map back to you as a human.
| bongodongobob wrote:
| Why? I honestly don't understand what's so bad about my chat
| logs vs google having all my emails.
| baby_souffle wrote:
| > Why? I honestly don't understand what's so bad about my
| chat logs vs google having all my emails.
|
| You might be a more benign user of chatGPT. Other people have
| turned it into a therapist and shared wildly intimate things
| with it. There is a whole cottage industry of journal apps
| that also now have "ai integration". At least some of those
| apps are using openAI on the back end...
| jacob019 wrote:
| I think the court overstepped by ordering OpenAI to save all user
| chats. Private conversations with AI should be protected - people
| have a reasonable expectation that deleted chats stay deleted,
| and knowing everything is preserved will chill free expression.
| Congress needs to write clear rules about what companies can and
| can't do with our data when we use AI. But honestly, I don't have
| much faith that Congress can get their act together to pass
| anything useful, even when it's obvious and most people would
| support it.
| amanaplanacanal wrote:
| If it's possible evidence as part of a lawsuit, of course they
| can't delete it.
| jacob019 wrote:
| A targeted order is one thing, but this applies to ALL data.
| My data is not possible evidence as part of a lawsuit, unless
| you know something I don't know.
| artursapek wrote:
| That's... not how discovery works
| jacob019 wrote:
| The government's power to compel private companies to
| preserve citizens' communications needs clear limits.
| When the law is ambiguous about these boundaries, courts
| end up making policy decisions that should come from
| Congress. We need legislative clarity that defines
| exactly when and how government can access private
| digital communications, not case-by-case judicial
| expansion of government power.
| artursapek wrote:
| My point is lawsuits make your data part of discovery
| retroactively. You aren't being sued right now, but
| perhaps you will be.
| lcnPylGDnU4H9OF wrote:
| Their point is that the discovery is asking for data of
| unrelated users. Necessarily so unless the claim is that
| all users who delete their chats are infringing.
| jacob019 wrote:
| Your point illustrates exactly why the tension between
| due process and privacy rights can't be fairly resolved
| by courts alone, since they have an inherent bias toward
| preserving their own discovery powers.
| nradov wrote:
| How did the court overstep? Orders to preserve evidence are
| routine in civil cases. Customer expectations about privacy
| have zero legal relevance.
| jacob019 wrote:
| Sure, preservation orders are routine - but this would be
| like ordering phone companies to record ALL calls just in
| case some might become evidence later. There's a huge
| difference between preserving specific communications in a
| targeted case and mass surveillance of every private
| conversation. The government shouldn't have that kind of
| blanket power over private communications.
| charonn0 wrote:
| > but this would be like ordering phone companies to record
| ALL calls just in case some might become evidence later
|
| That's not a good analogy. They're ordered to preserve
| records they would otherwise delete, not create records
| they wouldn't otherwise have.
| jacob019 wrote:
| They are requiring OpenAI to log API calls that would
| otherwise not be logged. I trust when OpenAI says they
| will not log or train on my sensitive business API calls.
| I trust them less to guard and protect logs of those API
| calls.
| jjk166 wrote:
| Change calls to text messages. The important thing is the
| keeping records of things unrelated to an open case which
| affect millions of people's privacy.
| Spivak wrote:
| I mean to be fair it is related to a current open case
| but the order is pretty ridiculous on its surface. It's
| feels different when the company and the employees
| thereof have to retain their own comms and documents, and
| that company must do the same for 3rd parties who are
| related but not actually involved in the lawsuit is a bit
| of a stretch.
|
| Why the NYT cares about a random ChatGPT user bypassing
| their paywall when an archive.ph link is posted on every
| thread is beyond me.
| nradov wrote:
| No, it wouldn't be like that at all. Phone companies and
| telephone calls are covered under a different legal regime
| so your analogy is invalid.
| ethagnawl wrote:
| Why is AI special in this regard? Why is my exchange with
| ChatGPT any more privileged than my DuckDuckGo search for _HIV
| test margin of error_?
| jacob019 wrote:
| You're right, it's not special.
|
| This is from DuckDuckGo's privacy policy: "We don't track
| you. That's our Privacy Policy in a nutshell. We don't save
| or share your search or browsing history when you search on
| DuckDuckGo or use our apps and extensions."
|
| If the court compelled DuckDuckGo to log all searches, I
| would be equally concerned.
| robocat wrote:
| DuckDuckGo uses Bing.
|
| It would be interesting to know how much Microsoft logs or
| tracks.
| sib wrote:
| That's a pretty significant difference, though.
|
| OpenAI (and other services) log and preserve your
| interactions, in order to either improve their service or
| to provide features to you (e.g., your chat history,
| personalized answers, etc., from OpenAI). If a court says
| "preserve all your user interaction logs," they exist and
| need to be preserved.
|
| DDG explicitly does not track you or retain any data about
| your usage. If a court says "preserve all your users
| interaction logs," there is nothing to be preserved.
|
| It is a very different thing - and a much higher bar - for
| a court to say "write code to begin logging user
| interaction data and then preserve those logs."
| ethagnawl wrote:
| I should have said "web search", as that's really what I
| meant -- DDG was just a convenient counterexample.
| webstrand wrote:
| OpenAI also claims to delete logs after 30 days if you've
| deleted them. Anything that you've deleted but hasn't
| been processed by OpenAI yet will now be open to
| introspection by the court.
| energy123 wrote:
| People upload about 100x more information about themselves to
| ChatGPT than search engines.
| raincole wrote:
| AI is not special and that's the _exact_ issue. The court
| made a precedence here. If OpenAI can be ordered to preserve
| all the logs, then DuckDuckGo can face the same issue even if
| they don 't want to do that.
| BrtByte wrote:
| The preservation order feels like a blunt instrument in a
| situation that needs surgical precision
| marcyb5st wrote:
| Would it be possible to comply with the order by anonymizing
| the data?
|
| The court is after evidence that users use ChatGPT to bypass
| paywalls. Anonymizing the data in a way that makes it
| impossible to 1) pinpoint the users and 2) reconstruct the
| generic user conversation history would preserve privacy and
| allow OpenAI to comply in good faith with the order.
|
| The fact that they are blaring sirens and hide behind the "we
| can't, think about users' privacy" feels akin to willingful
| negligence or that they know they have something to hide.
| Miraltar wrote:
| Anonymizing data is really hard and I'm not sure they'd be
| allowed to do it. I mean they're accused of deleting
| evidences, why would they be allowed to alter it ?
| lcnPylGDnU4H9OF wrote:
| > feels akin to willingful negligence or that they know they
| have something to hide
|
| Not at all; there is a presumption of innocence. Unless a
| given user is plausibly believed to be violating the law,
| there is no reason to search their data.
| pjc50 wrote:
| Consider the opposite prevailing, where I can legally protect
| my warez site simply by saying "sorry, the conversation where I
| sent them a copy of a Disney movie was private".
| lcnPylGDnU4H9OF wrote:
| If specific users are violating the law, then a court can and
| should order _their_ data to be retained.
| riskable wrote:
| The legal situation you describe is a matter of
| _impossibility_ and unrelated to the OpenAI case.
|
| In the case of a warez site they would never have logged such
| a "conversation" to begin with. So if the court requested
| that they produce all such communications the warez site
| would simply declare that as, "Impossibility of Performance".
|
| In the case of OpenAI the courts are demanding that they
| preserve all _future_ communications from _all_ their end
| users--regardless of whether or not those end users are
| parties (or even relevant) to the case. The court is
| literally demanding that they re-engineer their product to
| record all communications where none existed previously.
|
| I'm not a lawyer but that seems like it would violate FRCP
| 26(b)(1) which covers "proportionality". Meaning: The effort
| required to record the evidence is not proportional relative
| to the value of the information sought.
|
| Also--generally speaking--courts recognize that a party is
| not required to create new documents or re-engineer systems
| to satisfy a discovery request. Yet that is exactly what the
| court has requested of OpenAI.
| NewJazz wrote:
| Increasingly irrelevant startup guards their moat.
| mark_l_watson wrote:
| what about when users check "private chat"? Probably need to keep
| that logged also?
|
| Does this pertain to Google Gemini, Meta chat, Anthropic, etc.
| also?
| lrvick wrote:
| There is absolutely no reason for these logs to exist.
|
| Run LLM in an enclave that generates ephemeral encryption keys.
| Have users encrypt text directly to those enclave ephemeral keys,
| so prompts are confidential and only ever visible in an
| environment not capable of logging.
|
| All plaintext data will always end up in the hands of governments
| if it exists, so make sure it does not exist.
| jxjnskkzxxhx wrote:
| Then a court will order that you don't encrypt. And probably go
| after you for trying to undermine the intent of previous court
| order. Or what, you thought you found an obvious loophole in
| the entire legal system?
| lrvick wrote:
| Yes. Because once you have remote attestation, anyone can
| host these enclaves in any country, and charge some tiny fee
| for their gpu time.
|
| Decentralize hosting and encryption then centralized
| developers of the open source software will be literally
| unable to comply.
|
| This well proven strategy would however only be possible if
| anything about OpenAI was actually open.
| paxys wrote:
| Encryption does not negate copyright laws. The solution here is
| for LLM builders to pay for training data.
| ronsor wrote:
| The solution here is to get rid of copyright.
| TechDebtDevin wrote:
| Do you have any reading on this?
| moshegramovsky wrote:
| Maybe a lawyer can correct me if I'm wrong, but I don't
| understand why some people in the article appear to think that
| this is causing OpenAI to breach their privacy agreement.
|
| The privacy agreement is a contract, not a law. A judge is well
| within their rights to issue such an order, and the privacy
| agreement doesn't matter at all if OpenAI has to do something to
| comply with a lawful order from a court of competent
| jurisdiction.
|
| OpenAI are like the new Facebook when it comes to spin.
| naikrovek wrote:
| Yep. Laws supersede contracts. Contracts can't legally bind any
| entity to break the law.
|
| Court orders are like temporary, extremely finely scoped laws,
| as I understand them. A court order can't compel an entity to
| break the law, but it can compel an entity to behave as if the
| court just set a law (for the specified entity, for the
| specified period of time, or the end of the case, whichever is
| sooner).
| unyttigfjelltol wrote:
| If I made a contract with OpenAI to keep information
| confidential, and the newspaper demanded access, via Court
| discovery or otherwise, then both the Court and OpenAI
| definitely should be attentive to my rights to intervene and
| protect the privacy of my confidential information.
|
| Normally Courts are oblivious to advanced opsec, which is one
| fundamental reason they got breached, badly, a few years ago.
| I just saw a new local order today on this very topic.[1]
| Courts are just waking up to security concepts that have been
| second nature to IT professionals.
|
| From my perspective, the magistrate judge here made two major
| goofs: (1) ignoring opsec as a reasonable privacy right for
| customers of an internet service and (2) essentially
| demanding that several hundred million of them intervene in
| her court to demand that she respect their ability to protect
| their privacy.
|
| The fact that the plaintiff is the news organization half the
| US loves to hate does not help, IMO. Why would that half of
| the country trust some flimsy "order" to protect their most
| precious secrets from an organization that lives and breathes
| cloak-and-dagger leaks and political subterfuge. NYT needed
| to keep their litigation high and tight and instead they
| drove it into a ditch with the help of a rather disappointing
| magistrate.
|
| [1] https://www.uscourts.gov/highly-sensitive-document-
| procedure...
| energy123 wrote:
| Local American laws supersede the contract law operating in
| other countries that OpenAI is doing business in?
| naikrovek wrote:
| local country laws supersede contract law in that country,
| as far as i am aware.
|
| us law does not supersede foreign contract law. how would
| that even work? why would you think that was possible?
| jjk166 wrote:
| A court order can be a lawful excuse for non-performance of a
| contract, but it's not always the case. The specifics of the
| contract, the court order, and the jurisdiction matter.
| wglb wrote:
| I have a friend who is a Forensic Attorney (certified Forensic
| Examiner and licensed attorney). He says "You folks are putting
| all of this subpoenable information on Google and other sites"
| husky8 wrote:
| https://speaksy.chat/
|
| Speaksy to the rescue for privacy and curiosity (easy-access
| jailbroken qwen3 8B in free tier, 32B in paid) May quickly hit
| capacity since it's all locally run for privacy.
| Jordan-117 wrote:
| Seems shortsighted to offer something like this with zero
| information about how it works. If your target market is
| privacy-conscious, slapping a "Privacy by Design" badge and
| some vague promises is probably not very convincing. (Also, the
| homepage claims "Your conversations remain on your device and
| are never sent to external servers," yet the ProductHunt page
| says "All requests are handled locally on my own farm and all
| data is burned" -- which is it?)
| PoachedEggs wrote:
| I wonder how this will square with business customers in the
| healthcare space that OpenAI signed a BAA with.
| ianks wrote:
| This ruling is unbelievably dystopian for anyone that values a
| right to privacy. I understand that the logs will be useful in
| the occasional conviction, but storing a log of people's most
| personal communications is absolutely not a just trade.
|
| To protect their users from the this massive overreach, OpenAI
| should defy this order and eat the fines IMO.
| yard2010 wrote:
| It's almost rigged. Either they are keeping the data (and ofc
| making money out of it) or deleting it destroying the evidence
| of the crimes they're committing..
| imiric wrote:
| This is a moot issue. OpenAI and all AI service providers
| already use all user-provided data for improving their models,
| and it's only a matter of time until they start selling it to
| advertisers, if they don't already. Whether or not they
| actually delete chat conversations is irrelevant.
|
| Anyone concerned about their privacy wouldn't use these
| services to begin with. The fact they are so popular is
| indicative that most people value the service over their
| privacy, or simply don't care.
| wongarsu wrote:
| Plenty of service providers (including OpenAI) offer you the
| option to kindly ask them not to, and will even contractually
| agree not to use or sell your data if you want such an
| agreement.
|
| Yes, they want to use everyone's data. But they also want
| everyone as a customer, and they can't have both at once.
| Offering people an opt-out is a popular middle-ground because
| the vast majority of people don't care about it, and those
| that do care are appeased
| malwrar wrote:
| They will do it when they need the money and/or feel they
| have the leverage for precisely the same reason that 99% of
| people won't care. It's better to assume they're just
| sitting on your data and waiting until they can get away
| with using it.
| imiric wrote:
| That's nice. How can a user verify whether they fully
| comply with those contracts?
|
| They have every incentive not to, and no oversight to hold
| them accountable if they don't. Do you really want to trust
| your data is safe based on a pinky promise from a company?
| grumpyinfosec wrote:
| You sue them and win damages? Courts tend to uphold
| contracts at face value.
| thewebguyd wrote:
| > The fact they are so popular is indicative that most people
| value the service over their privacy, or simply don't care.
|
| Or, the general populace just doesn't understand the actual
| implications. The HN crowd can be guilty of severely
| overestimating the average person's tech literacy, and
| especially their understanding of privacy policies and ToS.
| Many may think they are OK with it, but I'd argue it's
| because they don't understand the potential real-world
| consequences of such privacy violations.
| romanovcode wrote:
| This has nothing to do with convictions of criminals but
| everything with CIA gathering profiles every single person they
| can.
| outside1234 wrote:
| Using ChatGPT to skirt paywalls? That's the reason for this?
| darkoob12 wrote:
| It should be possible to inquire about a fabrication of AI
| models.
|
| Let's say someone creates Russian propaganda with thesemodels or
| create fraudulent documents.
| JimDabell wrote:
| Should a word processor keep records of all the text it ever
| edited in case it was used to create propaganda? What about
| Photoshop?
| darkoob12 wrote:
| But we are living in a different world. Now, these tools can
| create material more compelling than reality.
| iammrpayments wrote:
| I see a lot of successful people on social media saying that they
| share their whole life to chatGPT using voice before sleeping. I
| wonder what they think about this.
| JCharante wrote:
| Disgusting move from the NYT
| mseri wrote:
| Some more details here: https://arstechnica.com/tech-
| policy/2025/06/openai-says-cour...
|
| And here are the links to the court irders and responses if you
| are curious:
| https://social.wildeboer.net/@jwildeboer/114530814476876129
| xivzgrev wrote:
| I use a made up email for chat gpt, fully expecting that a)
| openai saves all my conversations and b) it will someday get
| hacked
| DaSHacka wrote:
| This, and I ensure to anonymize any information I feed it, just
| in case. (Mostly just swapping out names / locations for
| placeholders)
| ppsreejith wrote:
| Doesn't chat gpt require a phone number?
| dlivingston wrote:
| Is there a service available to request API keys for LLMs? Not
| directly from OpenAI or Anthropic, where your real identity /
| organization is tied to the key, but some third-party service
| that acts as a "VPN" / proxy?
| karlkloss wrote:
| In theory, they could get into serious legal trouble, if a user
| or chatgpt writes child pornography. The posession alone is a
| felony in many countries, even if it is completely fictional
| work. As are links to CP sites.
| KnuthIsGod wrote:
| Palantir will put the logs to "good" use...
|
| At least the Chinese are open about their authoritarian system
| and constant snooping on users.
|
| Time to switch to Chinese AI. It can't be any worse than using
| American AI.
| romanovcode wrote:
| Objectively it's better - Chinese authorities cannot prosecute
| you and only way you will get into trouble if the Chinese are
| sharing the data with USA.
| bravesoul2 wrote:
| What about Azure? API?
|
| Anyway the future is open models.
| adwawdawd wrote:
| Always good to remember that you can't destroy data that you
| didn't create in the first place.
| BrtByte wrote:
| Feels like the court's going with a "just in case" fishing
| expedition, and it's landing squarely on end users. Hope there's
| a better technical or legal compromise coming
| theyinwhy wrote:
| If I read this correctly, I personally can be found guilty of
| copyright infringement because of my chat with the AI? Why am I
| to blame for the answers the AI provides? Can someone elaborate,
| what am I missing?
| DaSHacka wrote:
| Easiest way for both the rights holders and AI megacorpos to
| both be happy is to push all the responsiblity onto the average
| joe who can't afford as expensive lawyers/lobbyists.
| paxys wrote:
| No you aren't reading this correctly
| rwyinuse wrote:
| No point using OpenAI when so many safer alternatives exist.
| Mistral AI, for instance, is based in Europe, in my experience
| their models work just as well OpenAI.
| badsectoracula wrote:
| AFAIK companies can pay Mistral to install their AI on their
| premises too, so they can have someone to ~~blame~~ provide
| support too :-P
| DaSHacka wrote:
| > No point using OpenAI when so many safer alternatives exist.
|
| And ironically, this now includes the Chinese AI companies too.
|
| Old bureaucratic fogeys will be the death of this nation.
| PeterStuer wrote:
| Isn't this a normal part of lawfull intercept and data retention
| regulation?
|
| Why would that not apply to LLM chat services?
| mkbkn wrote:
| Does that mean that sooner or later US-based LLM companies will
| also be required to do so?
| alpineman wrote:
| Cancelling my NYTimes subscription - they are massively
| overreaching by pushing for this
| cedws wrote:
| Not that it makes it any better but I wouldn't be surprised if
| the NSA had a beam splitter siphoning off every byte going to
| OpenAI already. Don't send sensitive data.
| exq wrote:
| The former head of the NSA is on the board. I'd be more
| surprised if the feds WEREN'T siphoning off every byte by now.
| atoav wrote:
| Ever since the patriot act I operate under the assumption that if
| a service is located in the US it is eventually already in acrive
| collaboration in by the US government or in the process of being
| forced to do so. Therefore any such service is out of the
| question for work-related stuff.
| junon wrote:
| Maybe this is in TFA but I can't see it, does this affect
| European users? Are they only logging everything in the US?
| glookler wrote:
| I don't know, but the claim pertains to damages from helping
| users bypass paywalls. Assuming a European can pay for many US
| sites this isn't a situation where the location of the user
| relates to the basis.
| HPsquared wrote:
| Worldwide, I think.
| singularity2001 wrote:
| Doesn't the EU pose very strict privacy requirements now? I
| mean NSA knows more about you than your mom and the Stasi ever
| dreamt of but that's not part of the court order.
| msgodel wrote:
| How is anyone surprised by this?
| mjgant wrote:
| I wonder how this affects Microsoft's privacy policy for Azure
| Open AI?
|
| https://learn.microsoft.com/en-us/legal/cognitive-services/o...
| DrScientist wrote:
| It's a day to day legal reality - that normal businesses live
| with every day - that if you are in any sort of legal dispute,
| particularly over things like IP, that legal hold ( don't delete
| ) get's put on anything that's likely relevant.
|
| It's entirely reasonable, and standard practice for courts to say
| 'while the legal proceedings is going on don't delete potential
| evidence relevant to the case'.
|
| More special case whining from tech bro's - who don't seem to
| understand the basic concepts of fairness or justice.
| baalimago wrote:
| What does "slam" even mean in this context..?
| seb1204 wrote:
| Click on it. To be honest I'm totally over the over hyped
| sensational headlines I see on most post or pages.
| romanovcode wrote:
| "slam" usually means that the article was written using AI.
| josh2600 wrote:
| End to end encryption of these systems cannot come soon enough!
| eterm wrote:
| Why would e2e help here? The other end is the one that's
| ordered to preserve the logs.
| rkagerer wrote:
| This highlights a significance today's cloud-addicted generation
| seems to completely ignore: who has control of your data.
|
| I'm not talking about contractual control (which is largely
| mooted as pretty much every cloud service has a ToS that's
| grossly skewed toward their own interests over yours, with
| clauses like indemnifications, blanket grants to share your data
| with "partners" without specifying who they are or precisely what
| details are conveyed, mandatory arbitration, and all kinds of
| other exceptions to what you'd consider respectful decency), but
| rather where your data lives and is processed.
|
| If you truly want to maintain confidence it'll remain private,
| don't send it to the cloud in the first place.
| fireflash38 wrote:
| Yeah, I see a ton of people all up in arms about privacy but
| ignoring that OpenAI doesn't give a rats ass about others
| privacy (see: scraping).
|
| Like why one good other bad?
| daveoc64 wrote:
| If something is able to be scraped, it isn't private.
|
| There is no technical reason for chats people have with
| ChatGPT or any similar service to be available on the web to
| everyone, so there is no way for them to be scraped.
| brookst wrote:
| It's not zero sum. I can believe that openai does not take
| privacy seriously enough _and also_ that I don't want every
| chat I've ever had with their product to be entered into the
| public record.
|
| "If one is good the other must be good" is far too simplistic
| thinking to apply to a situation like this.
| fireflash38 wrote:
| I personally just can't fathom the logic that sending
| something so private and critical to OpenAI is ok, but to
| have courts view it is not? Like if it's so private, why in
| hell would you give it to a company that has shown that it
| cares not at all about others privacy?
| brookst wrote:
| Interesting. It seems obvious to me.
|
| I've asked ChatGPT medical things that are private but
| not incriminating or anything, because I trust ChatGPT's
| profit motive to just not care about my individual
| issues. But I would be pretty irritated if the government
| stepped in and mandated they make my searches public and
| linkable to me.
|
| Are you perhaps taking an absolutist view where anything
| less than perfect attention to all privacy is the same as
| making all logs of everyone public?
| Xelynega wrote:
| > But I would be pretty irritated if the government
| stepped in and mandated they make my searches public and
| linkable to me.
|
| Who is calling for this? Are you perhaps taking an
| absolutist view where "not destroying evidence" is the
| same as "mandated they make my searches public and
| linkable to me"? That's quite ridiculous.
| ahmeneeroe-v2 wrote:
| This seems like an unhelpful extension of the word "privacy".
| Scraping is something, but it is mostly not a privacy
| violation.
| jpadkins wrote:
| the post does not reflect the reality that it is not 'your
| data'*. When you use a service provider, it's their data. They
| may give you certain rights to influence your usage or payment
| of their service, but if it's not on machines you control then
| it's not your data.
|
| *my legal argument is "possession is 9/10ths of the law"
| grafmax wrote:
| Framing this as a moralized issue of "addiction" on the part of
| consumers naturalizes the structural cause of the problem.
| Concentrated wealth benefits from the cloud capital consumers
| generate. It's this group that has most of the control over how
| our data is collected. These companies reduce the number and
| quality of choices available to us. Blaming consumers for
| choosing the many conveniences of cloud data when the incentive
| structure has been carefully tailored to funnel our data into
| their possession and control is truly a superficial take.
| keybored wrote:
| Well put. And generalizes to most consumer-blaming.
| bunderbunder wrote:
| And this kind of consumer-blaming ultimately serves the
| interests of the very people who benefit most from the
| status quo, but shifting attention away from them and
| toward people who are easy to pick on but ultimately have
| very little control over the situation. For most people,
| opting out of the cloud is tantamount to opting out of
| modern society.
|
| I can't even get important announcements from my kids'
| school without signing up for yet another cloud service.
| next_xibalba wrote:
| This is such a wildly ahistorical and conspiratorial take on
| how the cloud evolved that it could only have come from a
| Marxist.
| KaiserPro wrote:
| > If you truly want to maintain confidence it'll remain
| private, don't send it to the cloud in the first place.
|
| I mean yes. but if _you_ host it, then you 'll be taken to
| court to hand that data over. Which means that you'll have less
| legal talent at your disposal to defend against it.
| lcnPylGDnU4H9OF wrote:
| > but if you host it, then you'll be taken to court to hand
| that data over.
|
| Not in this case. The Times seems to be claiming that OpenAI
| is infringing rather than any particular user. If one does
| not use OpenAI then their data is not subject to this.
| throwaway290 wrote:
| You don't have control over your data in the eyes of these
| guys... this was clear as soon as they started training their
| LLM on it without asking you
| nashashmi wrote:
| We have shifted over to SaaS so much for convenience that we
| have lost sight of "our control".
|
| I imagine a 90s era software industry for today's tech world:
| person buys a server computer, person buys an internet
| connection with stable ip, person buys server software boxes to
| host content on the internet, person buys security suite to
| firewall access.
|
| Where is the problem in this model? Aging computers?
| Duplicating computing hardware for little use? Unsustainable?
| Not green/efficient?
| SpaceL10n wrote:
| > who has control of your data
|
| As frustrating as it is, the answer seems to be everyone and no
| one. Data in some respects is just an observation. If I walk
| through a park, and I see someone with red hair, I just
| collected some data about them. If I see them again, perhaps
| strike up a conversation, I learn more. In some sense, I own
| that data because I observed it.
|
| On the other other hand, I think most decent people would agree
| that respecting each other's right to privacy is important.
| Should the owner of the red hair ask me to not share personal
| details about them, I would gladly accept, because I personally
| recognize them as the owner of the source data. I may possess
| an artifact or snapshot of that data, but it's their hair.
|
| In a digital world where access controls exist, we have an
| opportunity to control the flow of our data through the public
| space. Unfortunately, a lot of work is still needed to make
| this a reality...if it's even possible. I like the Solid
| Project for it's attempt to rewrite the internet to put more
| control in the hands of the true data owners. But, I wonder if
| my observation metaphor is still possible even in a system like
| Solid.
| tarr11 wrote:
| How do consumers utilize expensive compute resources in this
| model? Eg, H100 GPUs.
| sneilan1 wrote:
| It's not developer's common interest to develop local services
| first. People build cloud services for a profit so they can be
| paid. However, sometimes developers need to build their
| portfolios (or out of pure interest) so they make software that
| runs locally anyway. A lot of websites can easily be ran on
| people's computers from a data perspective but it's a lot of
| work to get that data in the first place and make it useable. I
| don't think people are truly "cloud-addicted". I think they
| simply do not have other choices.
| sinuhe69 wrote:
| Why could a court favor the interest of the New York Times in a
| vague accusation versus the interest and right of hundred
| millions people?
|
| Billion people use the internet daily. If any organization
| suspects some people use the Internet for illicit purposes
| eventually against their interests, would the court order the ISP
| to log all activities of all people? Would Google be ordered to
| save the search of all its customers because some might use it
| for bad things? And once we start, where will we stop? Crimes
| could happen in the past or in the future, will the court order
| the ISP and Google to retain the logs for 10 years, 20 years? Why
| not 100 years? Who should bear the cost for such outrageous
| demands?
|
| The consequences of such orders are of enormous impact the puny
| judge can not even begin to comprehend. Privacy right is an
| integral part of the freedom of speech, a core human right. If
| you don't have private thoughts, private information, anybody can
| be incriminated against them using these past information. We
| will cease to exist as individuals and I argue we will cease to
| exist as human as well.
| fireflash38 wrote:
| In your arguments for privacy, do you consider privacy from
| OpenAI?
| rvnx wrote:
| Cut a joke about ethics and OpenAI
| ethersteeds wrote:
| He is what now?! That is a risible claim.
| nindalf wrote:
| He was being facetious.
| ethersteeds wrote:
| Alas, it was too early
| maest wrote:
| Original comment, lest the conversation chain does not make
| sense
|
| > Sam Altman is the most ethical man I have ever seen in
| IT. You cannot doubt he is vouching and fighting for your
| privacy. Especially on YCombinator website where free
| speech is guaranteed.
| humpty-d wrote:
| I fail to see how saving all logs advances that cause
| hshdhdhj4444 wrote:
| Because this is SOP in any judicial case?
|
| Openly destroying evidence isn't usually accepted by
| courts.
| brookst wrote:
| Is there any evidence of import that would only be found
| in one single log among billions? The fact that NYT
| thinks that merely sampling 1% of logs would not support
| their case is pretty damning.
| fluidcruft wrote:
| I don't know anything about this case but it has been
| alleged that OpenAI products can be coaxed to return
| verbatim chunks of NYT content.
| brookst wrote:
| Sure, but if that is true, what is the evidentiary
| difference between preserving 10 billion conversations
| and preserving 100,000 and using sampling and statistics
| to measure harm?
| fluidcruft wrote:
| The main differences seem to be that it doesn't require
| the precise form of the queries to be known a priori and
| that it interferes with the routine destruction of
| evidence via maliciously-compliant mealy-mouthed word
| games, for which the tech sector has developed a
| significant reputation.
|
| Furthermore there is no conceivable harm resulting from
| requiring evidence to be preserved for an active trial.
| Find a better framing.
| ToValueFunfetti wrote:
| No conceivable harm in what sense? It seems obvious that
| it is harmful for a user who requests and is granted
| privacy to then have their private messages delivered to
| NYT. Legally it may be on shakier ground from the
| individual's perspective, but OpenAI argues that the harm
| is to their relationship with their customers and various
| governments, as well as the cost of the implementation
| effort:
|
| >For OpenAI, risks of breaching its own privacy
| agreements could not only "damage" relationships with
| users but could also risk putting the company in breach
| of contracts and global privacy regulations. Further, the
| order imposes "significant" burdens on OpenAI, supposedly
| forcing the ChatGPT maker to dedicate months of
| engineering hours at substantial costs to comply, OpenAI
| claimed. It follows then that OpenAI's potential for harm
| "far outweighs News Plaintiffs' speculative need for such
| data," OpenAI argued.
| sib wrote:
| >> It seems obvious that it is harmful for a user who
| requests and is granted privacy to then have their
| private messages delivered to NYT.
|
| This ruling is about preservation of evidence, not (yet)
| about delivering that information to one of the parties.
|
| If judges couldn't compel parties to preserve evidence in
| active cases, you could see pretty easily that parties
| would aggressively destroy evidence that might be harmful
| to them at trial.
|
| There's a whole later process (and probably arguments in
| front of the judge) about which evidence is actually
| delivered, whether it goes to the NYT or just to their
| lawyers, how much of it is redacted or anonymized, etc.
| baobun wrote:
| It's a honeypot from the beginning y'all
| capnrefsmmat wrote:
| Courts have always had the power to compel _parties to a
| current case_ to preserve evidence. (For example, this was an
| issue in the Google monopoly case, since Google employees were
| using chats set to erase after 24 hours.) That becomes an issue
| in the discovery phase, well after the defendant has an
| opportunity to file a motion to dismiss. So a case with no
| specific allegation of wrongdoing would already be dismissed.
|
| The power does not extend to any of your hypotheticals, which
| are not about active cases. Courts do not accept cases on the
| grounds that some bad thing might happen in the future; the
| plaintiff must show some concrete harm has already occurred.
| The only thing different here is how much potential evidence
| OpenAI has been asked to retain.
| lcnPylGDnU4H9OF wrote:
| So then the courts need to find who is setting their chats do
| be deleted and order them to stop. Or find specific
| infringing chatters and order OpenAI to preserve these
| specified users' logs. OpenAI is doing the responsible thing
| here.
| capnrefsmmat wrote:
| OpenAI is the custodian of the user data, so they are
| responsible. If you wanted the court (i.e., the plaintiffs)
| to find specific infringing chatters, first they'd have to
| get the data from OpenAI to find who it is -- which is
| exactly what they're trying to do, and why OpenAI is being
| told to preserve the data so they can review it.
| happyopossum wrote:
| So the courts should start ordering all ISPs, browsers,
| and OSs to log all browsing and chat activity going
| forward, so they can find out which people are doing bad
| things on the internet.
| Vilian wrote:
| Or you didn't read what was written by the other comment,
| or are just arguing in bad faith, what's even weierder
| because the guy was only explaining how the the system
| always worked
| lovich wrote:
| If those entities were custodians in charge of the data
| at hand in the court case, the court would order that.
|
| This post appears to be full of people who aren't
| actually angry at the results of this case but angry at
| how the US legal system has been working for decades,
| possibly centuries since I don't know when this precedent
| was first set
| scarab92 wrote:
| Is it not valid to be concerned about overly broad
| invasions of privacy regardless of how long such orders
| have been occurring?
| Retric wrote:
| What privacy specifically? The courts have always been
| able to compel people to recount things they know which
| could include a conversation between you and your plumber
| if it was somehow related to a case.
|
| The company records and uses this stuff internally,
| retention is about keeping information accurate and
| accessible.
|
| Lawsuits allow in a limited context the sharing of non
| public information held by individuals/companies in the
| lawsuit. But once you submit something to OpenAI it's now
| there information not just your information.
| nickff wrote:
| I think that some of the people here dislike (or are
| alarmed by) the way that the court can compel parties to
| retain data which would otherwise have vanished into the
| ether.
| lovich wrote:
| It's not private. You handed over the data to a third
| party.
| dragonwriter wrote:
| Its not an "invasion of privacy" for a company who
| already had data to be prohibited from destroying it when
| they are sued in a case where that data is evidence.
| dogleash wrote:
| Yeah, sure. But understanding the legal system tells us
| the players and what systems exist that we might be mad
| at.
|
| For me, one company obligated to retain business records
| during civil litigation against another company, reviewed
| within the normal discovery process is tolerable.
| Considering the alternative is lawlessness. I'm fine with
| it.
|
| Companies that make business records out of invading
| privacy? They, IMO, deserve the fury of 1000 suns.
| rodgerd wrote:
| If you cared about your privacy, why are you handing all
| this stuff to Sam Altman? Did he represent that OpenAI
| would be privacy-preserving? Have they taken any
| technical steps to avoid this scenario?
| dragonwriter wrote:
| No, they should not.
|
| However, if the ISP, for instance, is sued, then it
| (immediately and without a separate court order) becomes
| illegal for them to knowingly destroy evidence in their
| custody relevant to the issue for which they are being
| sued, and if there is a dispute about their handling of
| particular such evidence, a court can and will order them
| specifically to preserve relevant evidence as necessary.
| And, with or without a court order, their destruction of
| relevant evidence once they know of the suit can be the
| basis of both punitive sanctions and adverse findings in
| the case to which the evidence would have been relevant.
| dragonwriter wrote:
| > So then the courts need to find who is setting their
| chats do be deleted and order them to stop.
|
| No, actually, it doesn't. Ordering a party to stop
| destroying evidence relevant to a current case (which is
| its obligation _even without a court order_ ) irrespective
| of whether someone else asks it to destroy that evidence is
| both within the well-established power of the court, and
| routine.
|
| > Or find specific infringing chatters and order OpenAI to
| preserve these specified users' logs.
|
| OpenAI is the alleged infringer in the case.
| IAmBroom wrote:
| Under this theory, if a company had employees shredding
| incriminating documents at night, the court would have to
| name those employees before ordering them to stop.
|
| That is ridiculous. The company itself receives that order,
| and is IMMEDIATELY legally required to comply - from the
| CEO to the newest-hired member of the cleaning staff.
| MeIam wrote:
| Time does not need user logs to prove such a thing if it was
| true. Times can show that it is possible so they can show how
| their own users can access the text. Why would they need
| other user's data?
| KaiserPro wrote:
| > Time does not need user logs to prove such a thing if it
| was true.
|
| No it needs to show how often it happens to prove a point
| of how much impact its had.
| MeIam wrote:
| Why would that matter, if people didn't use it as much,
| does it mean that it doesn't matter if there were few
| people?
| delusional wrote:
| You have to argue damages. It actually has to have cost
| NYT some money, and for that you need to know some
| extent.
| MeIam wrote:
| We don't even know if Times uses AI to get information
| from other sources either. They can get a hint of news
| and then produce their material.
| cogman10 wrote:
| OpenAI is also entitled to discovery. They can literally
| get every email and chat the times has and require from
| this point on they preserve such logs
| delusional wrote:
| Who cares? That's not a legal argument and it doesn't
| mean anything to this case.
| lovich wrote:
| Oh, I was unaware that Times was inventing a novel
| technology with novel legal questions.
|
| It's very impressive they managed to do such innovation
| in their spare time while running a newspaper and site
| KaiserPro wrote:
| > We don't even know if Times uses AI to get information
| from other sources either
|
| which is irrelevant at this stage. Its a legal principle
| that both sides can fairly discover evidence. As finding
| out how much openAI has infringed copyright is pretty
| critical to the case, they need to find out.
|
| After all, if its only once or twice, thats a couple of
| dollars, if its millions of times, that hundreds of
| millions
| dragonwriter wrote:
| > Why would that matter
|
| Because its a copyright infringement case, so existence
| and the scale of the infringement is relevant to both
| whether there is liability and, if so, how much; the
| issue isn't that it is possible for infringement to
| occur.
| dragonwriter wrote:
| > Times can show that it is possible
|
| The allegation is not that merely that infringement is
| possible; the actual occurrence and scale are relevant to
| the case.
| mandevil wrote:
| For the most part (there are a few exceptions), in the US
| lawsuits are not based on "possible" harm but actual
| observed harm. To show that, you need actual observed user
| behavior.
| golol wrote:
| So if Amazon sues Google, claiming that it is being
| disadvantaged in search rankings, a court should be able to
| force Google to log all search activity, even when users
| delete it?
| saddist0 wrote:
| It can be just anonymised search history in this case.
| mattnewton wrote:
| That sounds impossible to do well enough without being
| accused of tampering with evidence.
|
| Just erasing the userid isn't enough to actually
| anonymize the data, and if you scrubbed location data and
| entities out of the logs you might have violated the
| court order.
|
| Though it might be in our best interests as a society we
| should probably be honest about the risks of this
| tradeoff; anonymization isn't some magic wand.
| Macha wrote:
| We found that one was a bad idea in the earliest days of
| the web when AOL thought "what could the harm be?" about
| turning over anonymised search queries to researchers.
| dogleash wrote:
| How did you go from a court order to persevere evidence
| and jump to dumping that data raw into the public record?
|
| Courts have been dealing with discovery including secrets
| that litigants never want to go public for longer than
| AOL has existed.
| dragonwriter wrote:
| > It can be just anonymised search history in this case.
|
| Depending on the exact issues in the case, a court might
| allow that (more likely, it would allow only turning over
| anonymized data in discovery, if the issues were such
| that that there was no clear need for more) but generally
| the obligation to preserve evidence does not include the
| right to edit evidence or replace it with reduced-
| information substitutes.
| cogman10 wrote:
| Yes. That's how the US court system works.
|
| Google can (and would) file to keep that data private and
| only the relevant parts would be publicly available.
|
| A core aspect to civil lawsuits is everyone gets to see
| everyone else's data. It's that way to ensure everything is
| on the up and up.
| dragonwriter wrote:
| If Amazon sues Google, a legal obligation to preserve all
| evidence reasonably related to the subject of the suit
| attaches _immediately_ when Google becomes aware of the
| suit, and, yes, if there is a dispute about the extent of
| that obligation and /or Google's actual or planned
| compliance with it, the court can issue an order relating
| to it.
| monetus wrote:
| At Google's scale, what would be the hosting costs of this
| I wonder. Very expensive after a certain point, I would
| guess.
| nobody9999 wrote:
| >At Google's scale, what would be the hosting costs of
| this I wonder. Very expensive after a certain point, I
| would guess.
|
| Which would be chump change[0] compared to the costs of
| an _actual trial_ with multiple lawyers /law firms,
| expert witnesses and the infrastructure to support the
| legal team before, during and after trial.
|
| [0] https://grammarist.com/idiom/chump-change/
| dragonwriter wrote:
| > Courts have always had the power to compel parties to a
| current case to preserve evidence.
|
| Not just that, _even without a specific court order_ parties
| to existing or reasonably anticipated litigation have a legal
| obligation that attaches immediately to preserve evidence.
| Courts tend to issue orders when a party presents reason to
| believe another party is out of compliance with that
| automatic obligation, or when there is a dispute over the
| extent of the obligation. (In this case, both factors seem to
| be in play.)
| btown wrote:
| Lopez v. Apple (2024) seems to be a recent and useful
| example of this; my lay understanding is that Apple was
| found to have failed in its duty to switch from auto-
| deletion (even if that auto-deletion was contractually
| promised to users) to an evidence-preservation level of
| retention, immediately when litigation was filed.
|
| https://codiscovr.com/news/fumiko-lopez-et-al-v-apple-inc/
|
| https://app.ediscoveryassistant.com/case_law/58071-lopez-
| v-a...
|
| Perhaps the larger lesson here is: if you don't want your
| service provider to end up being required to retain your
| private queries, there's really no way to guarantee it, and
| the only real mitigation is to choose a service provider
| who's less likely to be sued!
|
| (Not a lawyer, this is not legal advice.)
| resource_waste wrote:
| >Privacy right is an integral part of the freedom of speech, a
| core human right.
|
| Are these contradictory?
|
| If you overhear a friend gossiping, can't you spread that
| gossip?
|
| Also, where are human rights located? I'll give you a
| microscope.(sorry, I'm a moral anti-realist/expressivist and I
| can't help myself)
| 152132124 wrote:
| I think you will have a better time arguing with a LLM
| mrtksn wrote:
| >Why could a court favor the interest of the New York Times in
| a vague accusation versus the interest and right of hundred
| millions people?
|
| Probably because they bothered to pursue such a thing and
| hundreds of millions people did not.
|
| How do you conclusively know if someone's content generating
| machine infringe with your rights? By saving all of its
| input/output for investigation.
|
| It's ridiculous, sure but is it less ridiculous than AI
| companies claiming that the copyrights shouldn't apply to them
| because it will be bad for their business?
|
| IMHO those are just growth pain. Back in the day people used to
| believe that the law don't apply on them because they did it on
| the internet and they were mostly right because the laws were
| made for another age. Eventually the laws both for criminal
| stuff and copyright caught up. Will be the same for AI, now we
| are in the wild west age of AI.
| TimPC wrote:
| AI companies aren't seriously arguing that copyright
| shouldn't apply to them because "it's bad for business". The
| main argument is that they qualify for fair use because their
| work is transformative which is one of the major criteria for
| fair use. Fair use is the same doctrine that allows a school
| to play a movie for educational purposes without acquiring a
| license for the public performance of that movie. The
| original works don't have model weights and can't answer
| questions or interact with a user so the output is
| substantially different from the input.
| mrtksn wrote:
| Yeah, and the online radio providers argued that they don't
| do anything shady, their service was basically just a very
| long antenna.
|
| Anyway, the laws were not written with this type of
| processing in mind. In fact the whole idea of intellectual
| property breaks down now. Just like the early days of the
| internet.
| AStonesThrow wrote:
| > allows a school to play a movie
|
| No, it doesn't. Play 10% of a movie for the purpose of
| critiquing it, perhaps.
|
| https://fairuse.stanford.edu/overview/fair-use/four-
| factors/
|
| Fair Use is not an _a priori_ exemption or exception; Fair
| Use is an "affirmative defense" so once you have your day
| in court and the judge asks your attorney why you needed to
| play 10% of _Priscilla, Queen of the Desert_ for your
| Gender Studies class, then you can run down those Four
| Factors enumerated by the Stanford article.
|
| Particularly "amount and substantiality".
|
| Teachers and churches get tripped up by this all the time.
| But I've also been blessed with teachers who were very
| careful academically and sought to impart the same caution
| on all students about using copyrighted materials. It is
| not easy when fonts have entered the chat!
|
| The same reason you or your professor cannot show/perform
| 100% of an unlicensed film under any circumstance, is the
| same basis that creators are telling the scrapers that they
| cannot consume 100% of copyrighted works on that end. And
| if the risks may involve reproducing 87% of the same work
| in their outputs, that's beyond the standard thresholds.
| c256 wrote:
| > Fair use is the same doctrine that allows a school to
| play a movie for educational purposes without acquiring a
| license for the public performance of that movie. This is a
| pretty bad example, since fair use has been ruled to NOT
| allow this.
| arcfour wrote:
| What Scrooge sued a school for exhibiting a film for
| educational purposes?!
| kitified wrote:
| Whether a school was actually sued over this is not
| relevant to whether it is legally allowed.
| mandevil wrote:
| It is a bad example, but not for that reason. Instead,
| it's a bad example because Federal copyright law has a
| specific carve out for school educational purposes:
|
| https://www.copyright.gov/title17/92chap1.html#110
| "Notwithstanding the provisions of section 106, the
| following are not infringements of copyright:
|
| (1) performance or display of a work by instructors or
| pupils in the course of face-to-face teaching activities
| of a nonprofit educational institution, in a classroom or
| similar place devoted to instruction, unless, in the case
| of a motion picture or other audiovisual work, the
| performance, or the display of individual images, is
| given by means of a copy that was not lawfully made under
| this title, and that the person responsible for the
| performance knew or had reason to believe was not
| lawfully made;"
|
| That is why it is not a good comparison with the broader
| Fair Use Four Factors test (defined in section 107:
| https://www.copyright.gov/title17/92chap1.html#107)
| because it doesn't need to even get to that analysis, it
| is exempted from copyright.
| no_wizard wrote:
| If AI companies don't want the court headaches they should
| instead preemptively negotiate with rights holders and get
| agreements in place for the sharing of data.
| arcfour wrote:
| Feels like bad faith to say that knowing full well that
|
| 1. This would also be a massive legal headache,
|
| 2. It would become impossibly expensive
|
| 3. We obviously wouldn't have the AI we have today, which
| is an incredible technology (if immature) if this
| happened. Instead the growth of AI would have been
| strangled by rights holders wanting infinity money
| because they know once their data is in that model, they
| aren't getting it back, ever--it's a one-time sale.
|
| I'm of the opinion that AI is and will continue to be a
| net positive for society. So I see this as essentially
| saying "let's go an remove this and delay the development
| of it by 10-20 years and ensure people can't train and
| run their own models feasibly for a lot longer because
| only big companies can afford real training datasets."
| allturtles wrote:
| Why not simply make your counterargument rather than
| accusing GP of being in bad faith? Your argument seems to
| be that it's fine to break the law if the net outcome for
| society is positive. It's not "bad faith" to disagree
| with that.
| arcfour wrote:
| But they didn't break the law. The NYT articles were not
| algorithms/AI.
|
| It's bad faith because they are saying "well, they should
| have done [unreasonable thing]". I explored their version
| of things from my perspective (it's not possible) and
| from a conciliatory perspective (okay, let's say they
| somehow try to navigate that hurdle anyways, is society
| better off? Why do I think it's infeasible?)
| allturtles wrote:
| If they didn't break the law, your pragmatic point about
| outcomes is irrelevant. Open AI is in the clear
| regardless of whether they are doing something great or
| something useless. So I don't honestly know what you're
| trying to say. I'm not sure why getting licenses to IP
| you want to use is unreasonable, it happens all the time.
|
| Edit: Authors Guild, Inc. v. Google, Inc. is a great
| example of a case where a tech giant _tried_ to legally
| get the rights to use a whole bunch of copyrighted
| content (~all books ever published), but failed. The net
| result was they had to completely shut off access to most
| of the Google Books corpus, even though it would have
| been (IMO) a net benefit to society if they had been able
| to do what they wanted.
| bostik wrote:
| > _Your argument seems to be that it 's fine to break the
| law if the net outcome for society is positive._
|
| In any other context, this would be known as "civil
| disobediance". It's generally considered something to
| applaud.
|
| For what it's worth, I haven't made up my mind about the
| current state of AI. I haven't yet seen an ability for
| the systems to perform abstract reasoning, to _actually_
| learn. (Show me an AI that has been fed with nothing but
| examples in languages A and B. Then demonstrate,
| conclusively, that it can apply the lessons it has
| learned in language M, which happens to be nothing like
| the first two.)
| allturtles wrote:
| > In any other context, this would be known as "civil
| disobediance". It's generally considered something to
| applaud.
|
| No, civil disobedience is when you break the law
| expecting to be punished, to force society to confront
| the evil of the law. The point is that you get publicly
| arrested, possibly get beaten, get thrown in jail. This
| is not at all like what Open AI is doing.
| nobody9999 wrote:
| >I'm of the opinion that AI is and will continue to be a
| net positive for society. So I see this as essentially
| saying "let's go an remove this and delay the development
| of it by 10-20 years and ensure people can't train and
| run their own models feasibly for a lot longer because
| only big companies can afford real training datasets."
|
| Absolutely. Which, presumably, means that you're fine
| with the argument that your DNA (and that of each member
| of your family) could provide huge benefits to medicine
| and potentially save millions of lives.
|
| But significant research will be required to make that
| happen. As such, we will be _requiring_ (with no opt outs
| allowed) you and your whole family to provide blood,
| sperm and ova samples weekly until that research pays
| off. You will receive no compensation or other
| considerations other than the knowledge that you 're
| moving the technology forward.
|
| May we assume you're fine with that?
| mandevil wrote:
| https://www.copyright.gov/title17/92chap1.html#110 seems to
| this non-lawyer to be a specific carve out allowing movies
| to be shown, face-to-face, in non-profit educational
| contexts without any sort of license. The Fair Use Four
| Factors test
| (https://www.copyright.gov/title17/92chap1.html#107) isn't
| even necessary in this example.
|
| Absent a special legal carve-out, you need to get judges to
| do the Fair Use Four Factors test, and decide on how AI
| should be treated. To my very much engineer and not legal
| eye, AI does great on point 3, but loses on points 1, 2,
| and 4, so it is something that will need to be decided by
| the judges, how to balance those four factors defined in
| the law.
| freejazz wrote:
| That's not entirely true. A lot of their briefing refers to
| how impractical and expensive it would be to license all
| the content they need for the models.
| rodgerd wrote:
| > AI companies aren't seriously arguing that copyright
| shouldn't apply to them because "it's bad for business".
|
| AI companies have, in fact, said that the law shouldn't
| apply to them or they won't make money. That is literally
| the argument Nick Clegg is using to ague that copyright
| protection should be removed from authors and musicians in
| the UK.
| shkkmo wrote:
| > It's ridiculous, sure but is it less ridiculous than AI
| companies claiming that the copyrights shouldn't apply to
| them because it will be bad for their business?
|
| Since that wasn't ever a real argument, your strawman is
| indeed ridiculous.
|
| The argument is that requiring people to have a special
| license to process text with an algorithm is a dramatic
| expansion of the power of copyright law. Expansions of
| copyright law will inherently advantage large corporate users
| over individuals as we see already happening here.
|
| New York Times thinks that they have the right to spy on the
| entire world to see if anyone might be trying to read
| articles for free.
|
| That is the problem with copyright. That is why copyright
| power needs to be dramatically curtailed, not dramatically
| expanded.
| piombisallow wrote:
| Regardless of the details of this specific case, the courts are
| not democratic and do not decide based on the interest of the
| parties or how many they are, they decide based on the law.
| brookst wrote:
| This is not true even in the slightest.
|
| The law is not a deterministic computer program. It's a
| complex body of overlapping work and the courts are
| specifically chartered to use judgement. That's why briefs
| from two parties in a dispute will often cite different laws
| and precedents.
|
| For instance, Winter v. NRDC specifically says that courts
| must consider whether an injunction is in the public
| interest.
| piombisallow wrote:
| "public interest" is a much more ambiguous thing than the
| written law
| otterley wrote:
| Yes. And, that's why both sides will make their cases to
| the court as to whether the public interest is served by
| an injunction, and then the court will make a decision
| based on who made the best argument.
| DannyBee wrote:
| Lawyer here
|
| First - in the US, privacy is not a constitutional right. It
| should be, but it's not. You are protected against government
| searches, but that's about it. You can claim it's a core human
| right or whatever, but that doesn't make it true, and it's a
| fairly reductionist argument anyway. It has, fwiw, also
| historically _not_ been seen as a core right for thousands of
| years. So i think it 's a harder argument to make than you
| think despite the EU coming around on this. Again, I firmly
| believe it should be a core right, but asserting that it is
| doesn't make that true.
|
| Second, if you want the realistic answer - this judge is
| probably overworked and trying to clear a bunch of simple
| motions off their docket. I think you probably don't realize
| how many motions they probably deal with on a daily basis.
| Imagine trying to get through 145 code reviews a day or
| something like that. In this case, this isn't the trial, it's
| discovery. Not even discovery quite yet, if i read the docket
| right. Preservation orders of this kind are incredibly common
| in discovery, and it's not exactly high stakes most of the
| time. Most of the discovery motions are just parties being a
| pain in the ass to each other deliberately. This normally isn't
| even a thing that is heard in front of a judge directly, the
| judge is usually deciding on the filed papers.
|
| So i'm sure the judge looked at it for a few minutes, thought
| it made sense at the time, and approved it. I doubt they spent
| hours thinking hard about the consequences.
|
| OpenAI has asked to be heard in person on the motion, i'm sure
| the judge will grant it, listen to what they have to say, and
| determine they probably fucked it up, and fix it. That is what
| most judges do in this situation.
| pama wrote:
| Thanks. As an EU citizen am I exempt from this order? How
| does the judge or the NYTimes or OpenAI know that I am an EU
| citizen?
| ElevenLathe wrote:
| The court in question has no obligations to you at all.
| jjani wrote:
| OpenAI does, by virtue of doing business in the EU.
| adgjlsfhk1 wrote:
| you aren't and they don't.
| mananaysiempre wrote:
| The current legal stance in the US seems to be that you,
| not being a US person, have no particular legally protected
| interest in privacy at all, so you have nothing to complain
| about here and can't even sue. The only avenue the EU would
| have to change that is the diplomatic one, but the
| Commission does not seem to care.
| HardCodedBias wrote:
| "First - in the US, privacy is not a constitutional right"
|
| What? The supreme court disagreed with you in Griswold v.
| Connecticut (1965) and Roe v. Wade (1973).
|
| While one could argue that they were _vastly stretching_ the
| meaning of words in these decisions the point stands that at
| this time privacy _is_ a constitutional right in the USA.
| krapp wrote:
| -\\_(tsu)_/- The supreme court overturned Roe v. Wade in
| 2022 and explicitly stated in their ruling that a
| constitutional right to privacy does not exist.
| DannyBee wrote:
| Yes. They went further and explicitly make the
| distinction between the kind of privacy we are talking
| about here ("right to shield information from
| disclosure"), and the kind they saw as protected in
| griswold, lawrence, and roe ("right to make and implement
| important personal decisions without governmental
| interference").
| DannyBee wrote:
| Roe v. wade is considered explicitly overruled, as well as
| considered wrongly decided in the first place, as of 2022
| (Dobbs).
|
| They also explicitly stated a constitutional right to
| privacy does not exist, and pointed out that Casey
| abandoned any such reliance on this sort of claim.
|
| Griswold also found a right to _marital_ privacy. Not
| general privacy.
|
| Griswold is also barely considered good law anymore, though
| i admit it has not been explicitly overruled - it is
| definitely on the chopping block, as more than just Thomas
| has said.
|
| In any case, more importantly, none of them have found any
| interesting right to privacy of the kind we are talking
| about here, but instead more specific rights to privacy in
| certain contexts. Griswold found a right to marital privacy
| in "the penumbra of the bill of rights". Lawrence found a
| right to privacy in your sexual activity.
|
| In dobbs, they explicitly further denied a right to general
| privacy, and argued previous decisions conflated these: "
| As to precedent, citing a broad array of cases, the Court
| found support for a constitutional "right of personal
| privacy." Id., at 152. But Roe conflated the right to
| shield information from disclosure and the right to make
| and implement important personal decisions without
| governmental interference."
|
| You are talking about the former, which none of these cases
| were about. They are all about the latter.
|
| So this is very far afield from a general right to privacy
| of the kind we are talking about, and more importantly, one
| that would cover anything like OpenAI chats.
|
| So basically, you have a ~200 year period where it was not
| considered a right, and then a 50 year period where
| specific forms of privacy were considered a right, and now
| we are just about back to the former.
|
| The kind of privacy we are talking about here ("the right
| to shield information from disclosure") has always been
| subject to a balancing of interests made by legislatures,
| rather than a constitutional right upon which they may not
| infringe. Example abound - you actually don't have to look
| any further than court filings themselves, and when you are
| allowed to proceed anonymously or redact/file things under
| seal. The right to public access is considered much
| stronger than your right to not want the public to know
| embarassing or highly private things about your life. There
| are very few exceptions (minors, etc).
|
| Again, i don't claim any of this is how it is should be.
| But it's definitely how it is.
| HardCodedBias wrote:
| "Dobbs. They also explicitly stated a constitutional
| right to privacy does not exist"
|
| I did not know this, thank you!
| sib wrote:
| I'd like to thank you for explaining this so clearly (and
| for "providing receipts," as the cool kids say).
|
| >> Again, i don't claim any of this is how it is should
| be. But it's definitely how it is.
|
| Agreed.
| shkkmo wrote:
| > It has, fwiw, also historically not been seen as a core
| right for thousands of years. So i think it's a harder
| argument to make than you think despite the EU coming around
| on this.
|
| This doesn't seem true. I'd assume you know more about this
| than I do though so can you explain this in more detail? The
| concept of privacy is definitely more than thousands of years
| old. The concept of a "human right", is arguably much newer.
| Do you have particular evidence that a right to privacy is a
| harder argument to make that other human rights?
|
| While the language differs, the right to privacy is enshrined
| more or less explicitly in many constitutions, including 11
| USA states. It isn't just a "european" thing.
| static_motion wrote:
| I understand what they mean. There's this great video [1]
| which explains it in better terms than I ever could. I've
| timestamped the link because it's quite long, but if you've
| got the time it's a fantastic video with a great narrative
| and presentation.
|
| [1] https://youtu.be/Fzhkwyoe5vI?t=4m9s
| ComposedPattern wrote:
| > It has, fwiw, also historically not been seen as a core
| right for thousands of years.
|
| _Nothing_ has been seen as a core right for thousands of
| years, as the concept of human rights is only a few hundred
| years old.
| tiahura wrote:
| While the Constitution does not explicitly enumerate a "right
| to privacy," the Supreme Court has consistently recognized
| substantive privacy rights through Due Process Clause
| jurisprudence, establishing constitutional protection for
| intimate personal decisions in Griswold v. Connecticut
| (1965), Lawrence v. Texas (2003), and Obergefell v. Hodges
| (2015).
| zerocrates wrote:
| Even in the "protected against government searches" sense
| from the 4th Amendment, that right hardly exists when dealing
| with data you send to a company like OpenAI thanks to the
| third-party doctrine.
| fluidcruft wrote:
| A pretty clear distinction is that all ISPs in the world are
| not currently involved in a lawsuit with New York Times and are
| not accused of deleting evidence. What OpenAI is accused of is
| significantly different from merely agnostically routing
| packets between A and B. OpenAI is not raising astronomical
| funds because they operate as an ISP.
| tailspin2019 wrote:
| > Privacy right is an integral part of the freedom of speech
|
| I completely agree with you, but as a ChatGPT user I have to
| admit my fault in this too.
|
| I have always been annoyed by what I saw as shameless breaches
| of copyright of thousands of authors (and other individuals) in
| the training of these LLMs, and I've been wary of the data
| security/confidentiality of these tools from the start too -
| and not for no reason. Yet I find ChatGPT et al so utterly
| compelling and useful, that I poured my personal data[0] into
| these tools anyway.
|
| I've always felt conflicted about this, but the utility just
| about outweighed my privacy and copyright concerns. So as angry
| as I am about this situation, I also have to accept some of the
| blame too. I knew this (or other leaks or unsanctioned use of
| my data) was possible down the line.
|
| But it's a wake up call. I've done nothing with these tools
| which is even slightly nefarious, but I am today deleting all
| my historical data (not just from ChatGPT[1] but other hosted
| AI tools) and will completely reassess my approach of using
| them - likely with an acceleration of my plans to move to using
| local models as much as I can.
|
| [0] I do heavily redact my data that goes into hosted LLMs, but
| there's still more private data in there about me than I'd
| like.
|
| [1] Which I know is very much a "after the horse has bolted"
| situation...
| CamperBob2 wrote:
| Keeping in mind that the purpose of IP law is to promote
| human progress, it's hard to see how legacy copyright
| interests should win a fight with AI training and
| development.
|
| 100 years from now, nobody will GAF about the New York Times.
| stackskipton wrote:
| IP law was to promote human progress by giving financial
| incentive to create this IP knowing it was protected, and
| you could make money off it.
| CamperBob2 wrote:
| We will all make a lot more money _and_ a lot more
| progress by storing, organizing, presenting, and
| processing knowledge as effectively as possible.
|
| Copyright is not a natural right by any measure; it's
| something we pulled out of our asses a couple hundred
| years ago in response to a need that existed at the time.
| To the extent copyright interferes with progress, as it
| appears to have sworn to do, it has to go.
|
| Sorry. Don't shoot the messenger.
| diputsmonro wrote:
| Why would you expect NYT or any other news organization
| to report accurate data to feed into your AI models if
| they can't make any money off of it?
|
| It's not just about profits, it's about paying reporters
| to do honest work and not cut corners in their reporting
| and data collection.
|
| If you think the data is valuable, then you should be
| prepared to pay the people who collect it, same as you
| pay for the service that collates it (ChatGPT)
| stale2002 wrote:
| But the main point is the human progress here. If there
| is an obvious case where it seriously gets in the way of
| human progress, then thats a problem and I hope we can
| correct it through any means necessary.
| cactusplant7374 wrote:
| > Why could a court favor the interest of the New York Times in
| a vague accusation versus the interest and right of hundred
| millions people?
|
| It simply didn't. ChatGPT hasn't deleted any user data.
|
| > "OpenAI did not 'destroy' any data, and certainly did not
| delete any data in response to litigation events," OpenAI
| argued. "The Order appears to have incorrectly assumed the
| contrary."
|
| It's a bit of a stretch to think a big tech company like
| ChatGPT is deleting users' data.
| huijzer wrote:
| > Why could a court favor the interest of the New York Times in
| a vague accusation versus the interest and right of hundred
| millions people?
|
| Well maybe some people in power have pressured the court into
| this decision? The New York Times surely has some power as well
| via their channels
| dogman144 wrote:
| You raise good points but the target of your support feels
| misplaced. Want private ai? You must self-host and inspect if
| it's phoning home. No way around it in my view.
|
| Otherwise, you are picking your data privacy champions as the
| exact same companies, people and investors that sold us social
| media, and did something quite untoward with the data they got.
| Fool me twice, fool me three times... where is the line?
|
| In other words - OAI has to save logs now? Candidly they
| probably were already, or it's foolish not to assume that.
| jrm4 wrote:
| _Love_ the spirit of what you say and I practice it myself,
| literally.
|
| But also, no - _Just self-host or it 's all your fault_ is
| never ever a sufficient answer to the problem.
|
| It's exactly the same as when Exxon says "what are you doing
| to lower your own carbon footprint?" It's shifting the burden
| unfairly; companies like OpenAI put themselves out there and
| thus must ALWAYS be held to task.
| naming_the_user wrote:
| Anything else is literally impossible, though.
|
| If you send your neighbour nudes then they have your nudes.
| You can put in as many contracts as you want, maybe they
| never digitised it but their friend is over for a drink and
| walks out of the door with the shoebox of film. Do not pass
| GO, do not collect.
|
| Conceivably we can try to control things like e.g. is your
| cellphone microphone on at all times, but once someone
| else, particularly an arbitrary entity (e.g. not a trusted
| family member or something) has the data, it is silly to
| treat it as anything other than gone.
| dogman144 wrote:
| I actually agree with your disagreement, and my answer is
| more scoped to a technical audience that has the know how
| base to deal with it.
|
| I wish it was different and I agree that there's a massive
| accountability hole with... who could it be?
|
| Pragmatically it is what it is, self host and hope for
| bigger picture change.
| lovich wrote:
| Then your problem is with the US legal system, not this
| individual ruling.
|
| You lose your rights to privacy in your papers without a
| warrant once you hand data off to a third party. Nothing in
| this ruling is new.
| dragonwriter wrote:
| > Why could a court favor the interest of the New York Times in
| a vague accusation versus the interest and right of hundred
| millions people?
|
| Because the _law_ favors preservation of evidence for an active
| case above most other interests. It 's not a matter of
| arbitrary preference by the particular court.
| trod1234 wrote:
| It doesn't, it favors longstanding caselaw and laws already on
| the books.
|
| There is a longstanding precedent with regards to business
| document retention, and chat logs have been part of that for
| years if not decades. The article tries to make this sound like
| this is something new, but if you look at the e-retention
| guidelines in various cases over the years this is all pretty
| standard.
|
| For a business to continue operating, they must preserve
| business documents and related ESI upon an appropriate legal
| hold to avoid spoliation. They likely weren't doing this
| claiming the data was deleted, which is why the judge ruled in
| favor against OAI.
|
| This isn't uncommon knowledge either, its required. E-discovery
| and Information Governance are things any business must meet in
| this area; and those documents are subject to discovery in
| certain cases, where OAI likely thought they could avoid it
| maliciously.
|
| The matter here is OAI and its influence rabble are churning
| this trying to do a runaround on longstanding requirements that
| any IT professional in the US would have reiterated from their
| legal department/Information Governance policies.
|
| There's nothing to see here, there's no real story. They were
| supposed to be doing this and didn't, were caught, and the
| order just forces them to do what any other business is
| required to do.
|
| I remember an executive years ago (decades really), asking
| about document retention, ESI, and e-discovery and how they
| could do something (which runs along similar lines to what OAI
| tried as a runaround). I remember the lawyer at the time
| saying, "You've gotta do this or when it goes to court you will
| have an indefensible position as a result of spoliation...".
|
| You are mistaken, and appear to be trying to frame this
| improperly towards a point of no accountability.
|
| I suggest you review the longstanding e-discovery retention
| requirements that courts require of businesses to operate.
|
| This is not new material, nor any different from what's been
| required for a long time now. All your hyperbole about privacy
| is without real basis, they are a company; they must comply
| with law, and it certainly is not outrageous to hold people who
| break the law to account, and this can only occur when
| regulatory requirements are actually fulfilled.
|
| There is no argument here.
|
| References: Federal Rules of Civil Procedure (FRCP) 1, 4, 16,
| 26, 34, 37
|
| There are many law firms who have written extensively on this
| and related subjects. I encourage you to look at those too.
|
| (IANAL) Disclosure: Don't take this as legal advice. I've had
| the opportunity to work with quite a few competent ones, but I
| don't interpret the law; only they can. If you need someone to
| provide legal advice seek out competent qualified counsel.
| rolandog wrote:
| > Why could a court favor the interest of the New York Times in
| a vague accusation versus the interest and right of hundred
| millions people?
|
| Can't you use the same arguments against, say, Copyright
| holders? Billionaires? Corporations doing the Texas two-step
| bankruptcy legal maneuver to prevent liability from allegedly
| poisoning humanity?
|
| I sure hope so.
|
| Edit: ... (up to a point)
| deadbabe wrote:
| OpenAI is a business selling a product, it's not a
| decentralized network of computers contributing spare
| processing power to run massive LLMs. Therefore, you can easily
| point a finger at them and tell them to stop some activity for
| which they are the sole gatekeeper.
| oersted wrote:
| I completely agree with you. But perhaps we should be more
| worried that OpenAI or Google can retain all this data and do
| pretty much what they want with it in the first place, without
| a judge getting into the picture.
| kazinator wrote:
| > Why could a court favor the interest of the New York Times in
| a vague accusation versus the interest and right of hundred
| millions people?
|
| Well, for one thing, a billion plagiarists against one creator
| are no stronger than a single plagiarist, legally speaking.
|
| > _use the Internet for illicit purposes_
|
| But there is no use of an LLM that doesn't involve
| plagiarism/infringement.
|
| There is a concept in law about something being primarily used
| in a certain way, e.g. lockpicking tools, brass knuckles, ...
| kragen wrote:
| Copyright in its current form is incompatible with private
| communication of any kind through computers, because computers by
| their nature make copies of the communication, so it makes any
| private communication through a computer into a potential crime,
| depending on its content. The logic of copyright enforcement,
| therefore, demands access to all such communications in order to
| investigate their legality, much like the Stasi.
|
| Inevitably such a far-reaching state power will be abused for
| prurient purposes, for the sexual titillation of the
| investigators, and to suppress political dissent.
| 6stringmerc wrote:
| This is a ludicrous assertion and factually inaccurate beyond
| all practical intelligence.
|
| A computer in service of an individual absolutely follows
| copyright because the creator is in control of the distribution
| and direction of the content.
|
| Besides, copyright is a civil statute, not criminal. Everything
| about this comment is the most obtuse form of FUD possible. I'm
| pro copyright reform, but this is "Uncle off his meds ranting
| on Facebook" unhinged and shouldn't be given credence
| whatsoever.
| kragen wrote:
| None of that is correct. Some of it is not even wrong,
| demonstrating an unbelievably profound ignorance of its
| topic. Furthermore, it is gratuitously insulting.
| pjc50 wrote:
| > Besides, copyright is a civil statute, not criminal
|
| Nope. https://www.justia.com/intellectual-
| property/copyright/crimi...
| malwrar wrote:
| > A computer in service of an individual absolutely follows
| copyright because the creator is in control of the
| distribution and direction of the content.
|
| I don't understand what means. A computer in service of an
| individual turns copyright law into mattress tag removal law
| --practically unenforceable.
| FrustratedMonky wrote:
| I thought it was a given that they were saving all logs, to in
| turn use for training.
| strogonoff wrote:
| Anyone who seriously cared about their privacy would not be using
| any of the commercial LLM offerings. This is the greatest
| honeypot for profile building the ad tech ever had.
| mritchie712 wrote:
| you're worried about ad targeting?
| bgwalter wrote:
| Ad targeting, user profiling etc. Post Snowden we can
| reasonably assume that the NSA will get an interface.
| strogonoff wrote:
| Goes hand in hand with everything else, like price
| discrimination, including by insurance companies, which in
| all likelihood are not required to disclose that the reason
| your health insurance premiums are up is because you asked
| ChatGPT how to quit smoking.
| bgwalter wrote:
| I've criticized the NYT and paywalls many times myself, but first
| of all OpenAI itself has all your data and we know how "delete"
| functions work in other places.
|
| The Twitter users quoted by Ars Technica, who cite "boomer
| copyright concerns" are pretty short sighted. The NYT and other
| mainstream sources, with all their flaws, provide the historical
| record that pundits can use to discuss issues.
|
| Glenn Greenwald can only point out inconsistencies of the NYT
| because the NYT exists. It is often the starting point for
| discussions.
|
| Some YouTube channels like the Grayzone and Breaking Points send
| reporters directly to press conferences etc. But that is still
| not the norm and important information should not be stored in a
| disorganized YouTube soup.
|
| So papers like the NYT _need_ to survive for democracy to
| function.
| mathgradthrow wrote:
| can we ban slam as a headline word?
| Tteriffic wrote:
| Times is taking a risk. The costs of all this will fall on them,
| if they don't get the judgement they sought at the end of the
| day. Plus OpenAI controls those costs and could drive them up.
| Plus any future litigation by OpenAI users suffering damages due
| to this could arguably be brought against Time years forward.
| It's an odd strategy on their part for evidence that could have
| just been adduced by a statistician (maybe).
| segmondy wrote:
| ... and this is why I run local models. my data, my data.
| HardCodedBias wrote:
| We have really lost the thread WRT our legal system when district
| court judges have such wide ranging power. I understand that
| everything can be appealed, but these judges can cause
| considerable harm.
|
| Ona T. Wang (she/her) ( https://www.linkedin.com/in/ona-t-wang-
| she-her-a1548b3/ ) would have a difficult time getting a job at
| OpenAI but she is given the full force of US law to direct the
| company in almost anyway she sees fit.
|
| The wording is quite explcit, and forceful:
|
| Accordingly, OpenAI is NOW DIRECTED to preserve and segregate all
| output log data that would otherwise be deleted on a going
| forward basis until further order of the Court (in essence, the
| output log data that OpenAI has been destroying), whether such
| data might be deleted at a user's request or because of "numerous
| privacy laws and regulations" that might require OpenAI to do so.
|
| Again, I have no idea how to fix this, but it seems broken.
| b0a04gl wrote:
| llm infra isn't even built for that kind of retention. none of
| it's optimised for long-tail access, audit-safe replay, or scoped
| encryption. feels like the legal system's modelling chat like
| it's email. it's not. it's stateful compute with memory glued to
| a token stream.
| robomartin wrote:
| From the article:
|
| "In the filing, OpenAI alleged that the court rushed the order
| based only on a hunch raised by The New York Times and other news
| plaintiffs. And now, without "any just cause," OpenAI argued, the
| order "continues to prevent OpenAI from respecting its users'
| privacy decisions." That risk extended to users of ChatGPT Free,
| Plus, and Pro, as well as users of OpenAI's application
| programming interface (API), OpenAI said."
|
| This is the consequence and continuation of the dystopian reality
| we have been living for many years. One where a random person,
| influencer, media outlet, politician attacks someone, a company
| or an entity to cause harm and even total destruction (losing
| your job, company boycott, massive loss of income, reputation
| destruction, etc.). This morning, on CNBC, Palantir's CEO
| discussed yet another false accusation made against the company
| by --surprise-- the NY Times, characterizing it as garbage. Prior
| to that was the entirety of the media jumping on Elon Musk
| accusing him of being a Nazi for a gesture used by dozens and
| dozens of politicians and presenters, most recently Corey Booker.
|
| Lies and manipulation. I do think that people are waking up to
| this and massively rejecting professional mass manipulators. We
| now need to take the next step and have them suffer real legal
| consequences for constant lies and, for Picard's sake, also
| address any influence they might have over the courts.
| ryeguy_24 wrote:
| Would Microsoft have to comply with this also? Most enterprise
| users are acquiring LLM services through Microsoft's instance of
| the models in Azure? (i.e. data is not going to Open AI but
| enterprise gets to use Open AI models)
| anbende wrote:
| My (not a lawyer) understanding is "no", because Microsoft is
| not administering the model (making available the chatbot and
| history logging), not retaining chats (typically, unless you
| configure it specifically to do this), and any logs or history
| are only retained on the customer's servers or tenant.
|
| Accessing information on a customer's server or tenant (I have
| been assured) would require a court order for the customer
| directly.
|
| But... as an 365 E5 user with an Azure account using the 4o
| through Foundry... I am much more nervous than I ever have
| been.
| 1vuio0pswjnm7 wrote:
| Perhaps OpenAI should not be collecting sensitive data in the
| first place. Who knows what they are using it for.
|
| Whereas parties to litigation that receive sensitive data are
| subject to limits on how it can be used.
| dsign wrote:
| This is a hard blow for OpenAI; I can see my employer scrambling
| to terminate their contract with them because of this. It could
| be a boom to Mistral.AI though.
| thanatropism wrote:
| Ctrl-F doesn't seem to show anyone remembering the Ballad of
| Scott Alexander.
|
| There's no reasonable narrative in which OpenAI are not villains,
| but NYT is notoriously one to shoot a man in Reno just to see him
| die.
| heisenbit wrote:
| Considering the use to research medical information or to self
| sooth psychological conditions through chats and their
| association with a real person it can get interesting as these
| domains have fairly stringent need to know rules attached with
| criminal liabilities.
| qintl55 wrote:
| Classic example of "courts/lawmakers do NOT understand tech". I
| don't know why this is still as true as it was 10-20 years ago.
| You get such weird rulings that are maybe well-intentioned but
| are so off base on actual impact.
| rimeice wrote:
| I'm curious what you're asking ChatGPT to get verbatim NYT
| articles in the response that you then want to delete? If ChatGPT
| has being doing this, the default is to keep every chat, so I
| doubt NYT lawyers need the small portion of deleted info to find
| evidence if it exists.
|
| Ofcourse if OpenAI was scanning your chat history for verbatim
| NYT text and editing and deleting that would be another thing,
| but that itself would also get noticed.
___________________________________________________________________
(page generated 2025-06-05 23:01 UTC)