(DIR) Post #ATgCgVp6NVcVowwe0G by MattHodges@mastodon.social
2023-03-16T16:26:03Z
0 likes, 0 repeats
I asked New #Bing to write a creative conclusion to the song lyric: “hello there the angel from my nightmare”. It then proceeded to just emit the actual lyrics to blink-182’s “I Miss You”, claimed it had given me a parody, and then quickly noped out by deleting the response. I’m not sure what content heuristic it tripped… maybe copyright.
(DIR) Post #ATgCgWQg7ng5hUWeZc by johnlray@mastodon.social
2023-03-16T16:46:41Z
0 likes, 0 repeats
@MattHodges Does this suggest one might be able to somewhat circuitously 'opt out' of training data collection for LLMs by liberally sprinkling "Trademarked," "Copyright," "™", etc. on one's written work? (This post is a Copyright DMCA Warning Copyright-strike strike copyright notice Digital Millennium Copyright Act private ™ ™ ™ ™ ™ trademark notice post by John's Well-Lawyered Post Factory LLC Copyright, TM)
(DIR) Post #ATgCgWzQ2dT1REmOiu by MattHodges@mastodon.social
2023-03-16T16:50:53Z
0 likes, 0 repeats
@johnlray I don't think it opts you out of training (the courts may one day have an actual opinion about that), but it might signal some risk in a model emitting your media in a significantly duplicative fashion?
(DIR) Post #ATgCgXRmLC9UrC32vY by afamiglietti79@mastodon.social
2023-03-16T17:06:16Z
0 likes, 0 repeats
@MattHodges @johnlray I mean, in theory, this sort of direct replication of training data ("memorization") is NOT supposed to happen. If it is happening more frequently than we think, that's a *major* scandal for the tech.
(DIR) Post #ATgCgY6XtclItd7bTE by TedUnderwood@sigmoid.social
2023-03-16T17:13:40Z
0 likes, 0 repeats
@afamiglietti79 @MattHodges @johnlray I think language models that do next-token-prediction are a little different from diffusion models on this. It's well known that if you go "once upon a midnight dreary" all the GPTs will say "while I pondered, weak and weary." As will all of us. That's not a scandal — or not a new one anyway. It's a very specific prompt.
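A minimal sketch (not part of the thread) of the completion behavior described above, assuming the Hugging Face transformers library and the small public GPT-2 checkpoint, neither of which is named in the posts; whether any given model actually reproduces the memorized continuation depends on what was in its training data:

    # Probe a next-token-prediction model with a highly distinctive prompt.
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    inputs = tokenizer("Once upon a midnight dreary", return_tensors="pt")

    # Greedy decoding: take the single most probable token at each step.
    # For a famous enough line, the memorized continuation can be the
    # argmax at every step -- which is how verbatim text falls out of
    # "mere" next-token prediction.
    output_ids = model.generate(
        **inputs,
        max_new_tokens=12,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))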
(DIR) Post #ATgCw76w1fSXoHj2YK by afamiglietti79@mastodon.social
2023-03-16T17:16:27Z
0 likes, 0 repeats
@TedUnderwood @MattHodges @johnlray interesting... that does make me wonder a bit what "generalization" means in the context of token prediction...