Post ATgCgXRmLC9UrC32vY by afamiglietti79@mastodon.social
 (DIR) Post #ATgCgVp6NVcVowwe0G by MattHodges@mastodon.social
       2023-03-16T16:26:03Z
       
       0 likes, 0 repeats
       
       I asked the new #Bing to write a creative conclusion to the song lyric: “hello there, the angel from my nightmare”. It then proceeded to just emit the actual lyrics to blink-182’s “I Miss You”, claimed it had given me a parody, and then quickly noped out by deleting the response. I’m not sure what content heuristic it tripped… maybe copyright.
       
 (DIR) Post #ATgCgWQg7ng5hUWeZc by johnlray@mastodon.social
       2023-03-16T16:46:41Z
       
       0 likes, 0 repeats
       
       @MattHodges Does this suggest one might be able to somewhat circuitously 'opt out' of training data collection for LLMs by liberally sprinkling "Trademarked," "Copyright," "™", etc. on one's written work? (This post is a Copyright DMCA Warning Copyright-strike strike copyright notice Digital Millennial Copyright Act private ™ ™ ™ ™ ™ trademark notice post by John's Well-Lawyered Post Factory LLC Copyright, TM)
       
 (DIR) Post #ATgCgWzQ2dT1REmOiu by MattHodges@mastodon.social
       2023-03-16T16:50:53Z
       
       0 likes, 0 repeats
       
       @johnlray I don't think it opts you out of training (the courts may one day have an actual opinion about that), but it might signal some risk in a model emitting your media in a significantly duplicative fashion?
       
 (DIR) Post #ATgCgXRmLC9UrC32vY by afamiglietti79@mastodon.social
       2023-03-16T17:06:16Z
       
       0 likes, 0 repeats
       
       @MattHodges @johnlray I mean, in theory, this sort of direct replication of training data ("memorization") is NOT supposed to happen. If it is happening more frequently than we think, that's a *major* scandal for the tech.
       
 (DIR) Post #ATgCgY6XtclItd7bTE by TedUnderwood@sigmoid.social
       2023-03-16T17:13:40Z
       
       0 likes, 0 repeats
       
       @afamiglietti79 @MattHodges @johnlray I think language models that do next-token-prediction are a little different from diffusion models on this. It's well known that if you go "once upon a midnight dreary" all the GPTs will say "while I pondered, weak and weary." As will all of us. That's not a scandal — or not a new one anyway. It's a very specific prompt.
       
 (DIR) Post #ATgCw76w1fSXoHj2YK by afamiglietti79@mastodon.social
       2023-03-16T17:16:27Z
       
       0 likes, 0 repeats
       
       @TedUnderwood @MattHodges @johnlray interesting... that does make me wonder a bit what "generalization" means in the context of token prediction...
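
       [Editor's note: the memorization effect discussed above can be illustrated with a toy sketch. A next-token model trained on a single famous line has only one plausible continuation at each step, so greedy decoding replays the training text verbatim, which is memorization rather than generalization. This bigram model and corpus are hypothetical, chosen only to mirror the "once upon a midnight dreary" example from the thread.]

       ```python
       from collections import defaultdict, Counter

       # Hypothetical toy corpus: one memorized line (the Poe example
       # mentioned in the thread), lowercased and stripped of punctuation.
       corpus = ("once upon a midnight dreary "
                 "while i pondered weak and weary").split()

       # Count bigram transitions observed in the training text.
       transitions = defaultdict(Counter)
       for prev, nxt in zip(corpus, corpus[1:]):
           transitions[prev][nxt] += 1

       def greedy_continue(prompt_tokens, steps):
           """Greedily pick the most frequent next token at each step."""
           out = list(prompt_tokens)
           for _ in range(steps):
               counts = transitions.get(out[-1])
               if not counts:
                   break  # no observed continuation: the model is stuck
               out.append(counts.most_common(1)[0][0])
           return out

       # Prompting with the opening of the memorized line replays the rest
       # verbatim, because each bigram was seen exactly once in training.
       completion = greedy_continue("once upon a midnight dreary".split(), 6)
       print(" ".join(completion))
       # → once upon a midnight dreary while i pondered weak and weary
       ```

       A model that generalizes would instead assign probability to continuations it never saw verbatim; with a one-line corpus there is nothing to generalize from, which is the degenerate case the thread is circling.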