[HN Gopher] Tests suggest clues of whose content was used to tra...
___________________________________________________________________
Tests suggest clues of whose content was used to train OpenAI's
Sora
Author : kgwgk
Score : 20 points
Date : 2025-10-01 19:48 UTC (3 hours ago)
(HTM) web link (www.washingtonpost.com)
(TXT) w3m dump (www.washingtonpost.com)
| DroneBetter wrote:
| https://archive.is/ozjEb (note some of the gifs become static
| images here)
| codedokode wrote:
| This vague copyright situation works against open-source AI
| models, which have to disclose the sources of their training data,
| while closed-source companies can freely use pirated material and
| gain an advantage over open-source models.
| smegma2 wrote:
| I'm normally skeptical of claims like this, but looking at the
| examples it seems that Sora is reproducing some of its training
| data verbatim. I guess it's a case of overfitting? In particular,
| the Civ example seems like it must have been copied almost
| verbatim.
___________________________________________________________________
(page generated 2025-10-01 23:02 UTC)