[HN Gopher] Stable-Audio-Demo
       ___________________________________________________________________
        
       Stable-Audio-Demo
        
       Author : beefman
       Score  : 476 points
       Date   : 2024-02-13 03:50 UTC (19 hours ago)
        
 (HTM) web link (stability-ai.github.io)
 (TXT) w3m dump (stability-ai.github.io)
        
       | lbourdages wrote:
       | This is right into the "uncanny valley" of music.
       | 
       | It definitely sounded "like music", but none of it is what a
       | human would produce. There's just something off.
        
         | bane wrote:
         | The overall audio quality sounds pretty good and it seems to do
         | a good job of sustaining a consistent rhythm and musical
         | concept. But I agree there's something "off" about some of the
         | clips.
         | 
         | - The rave music sounds great. But that's because EDM can be
         | quite out there in terms of musical construction.
         | 
          | - The guitar sounds weird because it doesn't sound like chords
          | a human hand can make, in a tuning nobody tunes their guitar
          | to, with a strange mix of open and closed strings that doesn't
          | make sense. I think the restrictions of what a guitar can do
          | aren't well understood by the model.
         | 
         | - The disco chord progression is bizarre. It doesn't sound bad,
         | but it's unlikely to be something somebody working in the genre
         | would choose.
         | 
         | - meditation music - I mean, most of that genre may as well
         | just be some randomized process
         | 
         | - drum solo - there's some weird issues in some of the drum
         | sounds, things like cymbals, rides and hats changing tone in
         | the middle of a note, some of the toms sound weird, it sounds
         | like a mix of stick and brush and stick and stick and brush all
         | at the same time...it's sort of the same problem the solo
         | guitar has where it's just not produced within the constraints
         | of what a drum player can actually do on an instrument made of
         | actual drums
         | 
         | - sound effects, all are pretty good, a little chunky and low
         | bit-rate or low sample-rate sounding, there's probably
         | something going on in the network that's reducing the rate
          | before it gets built back up. There's a constant sort of reverb
         | in all of the examples
         | 
         | I honestly can't say I prefer their model over some of the
         | musicgen output even if their model is doing a better job at
         | following the prompts in some cases.
         | 
          | All of the models have low-bitrate encoding problems and
          | other weird anomalies. Some of it reminds me of the
         | output from older mp3 encoders, where hihats and such would get
         | very "swishy" sounding. You can hear some of it in the
         | autoencoder reconstructions, especially the trumpet and the
         | last example.
         | 
         | However, in any case, I'm actually glad in some ways to see the
         | progress being made in this area. It's really impressive. This
          | was complete science fiction only a few years ago.
        
           | darkwater wrote:
           | > - drum solo - there's some weird issues in some of the drum
           | sounds, things like cymbals, rides and hats changing tone in
           | the middle of a note, some of the toms sound weird, it sounds
           | like a mix of stick and brush and stick and stick and brush
           | all at the same time...it's sort of the same problem the solo
           | guitar has where it's just not produced within the
           | constraints of what a drum player can actually do on an
           | instrument made of actual drums
           | 
            | And I would say that there is also background noise from time
            | to time; at some point I heard some noise akin to voices.
           | Maybe it is some artifact caused by the training data (many
           | drum solos are performed exclusively live).
        
         | otabdeveloper4 wrote:
          | AI pictures are the same. We are more tolerant of six-fingered
          | pictures with missing limbs, for some reason.
        
           | lbourdages wrote:
           | We're used to drawings, 3D renders, etc.
           | 
           | There's no such thing as "artificial music" - at the very
           | least, not since electronic music has become mainstream.
        
         | RobinL wrote:
         | Here is a silly song I generated using suno.ai, which I have
         | found to be incredibly impressive (at least, a small percentage
         | of its outputs are very good, most are bad). I think it's good
         | enough that most humans wouldn't realise it's AI generated.
         | https://app.suno.ai/song/8a64868d-9dd3-46db-91af-f962d4bec8b...
        
           | Agraillo wrote:
            | Very good for my taste, but I should clarify: I'm obsessed
            | with catchy tunes, as a listener and as a hobby musician,
            | growing my own brainworms from time to time. And I must say
            | that suno.ai is very impressive; in my case it produces semi-
            | ready brainworms in 30%-50% of cases. What's more important,
            | it's really an inspiration tool for all kinds of tasks, like
            | lyrics polishing or playing along after track separation.
            | Maybe catchy melodies are not for everyone, but who can argue
            | with the charts, when The Beatles, ABBA and Queen almost
            | always produced them.
        
           | comex wrote:
           | Wow. I'm guessing it's generating MIDI or something rather
           | than synthesizing audio from scratch? Even so, the quality of
           | the score is leaps and bounds better than any of the long-
           | form audio on the Stable Audio demo page (either Stable Audio
           | itself or the other models). The audio model outputs seem to
           | take a sequence of 1 to 3 chords, add a barebones melody on
           | top, and basically loop this over and over. When they deviate
           | from the pattern, it feels unplanned and chaotic and they
           | often just snap back to the pattern without resolving the
           | idea added by the deviation. (Either that or they completely
           | change course and forget what they were doing before.) Yes,
           | EDM in particular often has repetitive chord structures and
           | basic melodies, but it's not _that_ repetitive. In
           | comparison, from listening to a few suno.ai outputs, they
           | reliably have complex melodies and reasonable chord
           | progressions. They do tend to be repetitive and formulaic,
           | but the repetition comes on a longer time scale and isn't as
           | boring. And they do sometimes get confused and randomly set
           | off in a new direction, but not as often. Most of the time,
           | the outputs sound like real songs. Which is not something I
           | knew AI could do in 2024.
        
             | RobinL wrote:
             | I don't have any special insight into how it works, but I
             | suspect it is largely synthesizing audio from scratch. The
             | more I've thought about it, the task of generating music
             | feels very similar to the task of text-to-speech with
              | realistic intonation. So it feels like the same techniques
             | would be applicable.
             | 
             | Suno do have an open source repo here that presumably uses
             | similar tech: https://github.com/suno-ai/bark
             | 
             | > Bark was developed for research purposes. It is not a
             | conventional text-to-speech model but instead a fully
             | generative text-to-audio model, which can deviate in
             | unexpected ways from provided prompts. Suno does not take
             | responsibility for any output generated. Use at your own
             | risk, and please act responsibly.
             | 
             | I've generated probably >200 songs now with Suno, of which
             | perhaps 10 have been any good, and I can't detect any
             | pattern in terms of the outputs.
             | 
             | Here's another one which is pretty good. I accidentally
             | copied and pasted the prompt and lyrics, and it's amazing
             | to me how 'musically' it renders the prompt:
             | 
             | https://app.suno.ai/song/d7bad82b-3018-4936-a06d-8477b400aa
             | e...
             | 
             | Here are a couple more which are pretty good (i use it
             | primarily for making fun songs for my kids):
             | 
             | https://app.suno.ai/song/a308ca8a-9971-47a3-8bb3-a95126ff1a
             | 8...
             | 
             | https://app.suno.ai/song/3b78a631-b52a-4608-a885-94f2edc190
             | b...
             | 
              | And this one's kind of interesting in that it can render
             | 'gregorian chant' (i mean it's not very good): https://app.
             | suno.ai/song/0da7502b-73cf-4106-88e8-26f4f465a5f...
             | 
             | But this is one reason it feels like these models are very
              | similar to text-to-speech but with a different training set.
        
             | Agraillo wrote:
             | My understanding is that they use a side effect of the Bark
             | model. The comment
             | https://news.ycombinator.com/item?id=35647569 from
             | JonathanFly probably explains it well. If you train your
              | model on a massive amount of audio mixing lyrics and
              | music, then prompting with lyrics alone pulls the music
              | along with it, just as that comment suggests prompting with
              | context-correlated text might pull in the background noises
              | usual for such a context. While writing this I can already
              | imagine training on a huge set of publicly performed poetry
              | pieces, which would allow generating novel performances by
              | artificial poets from novel prompts. This is different from
              | riffusion.com's approach, whose genius idea is more or less
              | feeding spectrograms as images to Stable Diffusion.
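        The spectrogram-as-image idea mentioned above can be sketched
        roughly: slice the audio into overlapping windows, take the
        magnitude of each window's FFT, move to a log scale, and
        normalize the result into a grayscale image an image model can
        consume. A minimal numpy sketch (the window size, hop, and
        normalization are illustrative assumptions, not riffusion's
        actual settings):

```python
import numpy as np

def audio_to_spectrogram(samples, n_fft=512, hop=128):
    # Short-time Fourier transform: windowed frames -> magnitude spectra
    window = np.hanning(n_fft)
    frames = []
    for start in range(0, len(samples) - n_fft, hop):
        frame = samples[start:start + n_fft] * window
        frames.append(np.abs(np.fft.rfft(frame)))
    mag = np.array(frames).T            # (freq bins, time frames)
    db = 20 * np.log10(mag + 1e-6)      # log scale, roughly dB
    # Normalize to 0..255 so it can be treated as a grayscale image
    img = (db - db.min()) / (db.max() - db.min() + 1e-9) * 255
    return img.astype(np.uint8)

# One second of a 440 Hz sine at a 16 kHz sample rate
t = np.arange(16000) / 16000
img = audio_to_spectrogram(np.sin(2 * np.pi * 440 * t))
print(img.shape)  # (257, 121): 257 frequency bins, 121 time frames
```

        Going the other way, decoding a generated spectrogram image
        back into audio, needs phase reconstruction (e.g. Griffin-Lim),
        which is one source of the artifacting in such pipelines.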
        
           | urbandw311er wrote:
           | That's impressive. Why do the printed lyrics for the second
           | chorus differ from the audio? (Which repeats those from the
           | first chorus)
        
             | RobinL wrote:
             | I generated the lyrics using ChatGPT 4 and the suno model
             | attempts to follow them.
             | 
             | It generally does a good job, but I have noticed it's
             | fairly common in a second chorus for it to ignore the
             | direction and instead use the same lyrics as the first
              | chorus.
        
           | npteljes wrote:
           | That is fantastic. It has a bit of weirdness in the
           | background, but nothing that would stop me from enjoying it.
        
         | dcre wrote:
         | One thing I noticed is that when it's playing chords, it seems
         | a lot more likely than human players to put both major and
         | minor thirds in. This isn't unheard of -- the famous Hendrix
         | chord in "Purple Haze" consists of root, major third, 7th,
         | minor third. But it sounds pretty weird when you do it in every
         | chord.
        
       | ShamelessC wrote:
       | So there aren't public weights, is that right? Having trouble
       | finding anything that says one way or the other.
       | 
       | edit: Oh okay, didn't realize this was somehow a controversial
       | comment to make. It would have been great if you had answered the
       | question before downvoting but that's fine I suppose.
        
         | grey8 wrote:
         | Nope. They did release code for training, inference and fine
         | tuning, but no datasets or weights.
         | 
         | See https://github.com/Stability-AI/stable-audio-tools
        
           | ShamelessC wrote:
           | Thanks!
        
           | turnsout wrote:
           | Wonder if it's an IP issue. They don't want every record
           | label coming after them.
        
             | ShamelessC wrote:
             | Yeah that tracks.
        
       | 8n4vidtmkvmk wrote:
       | The music is pretty meh but the sound effects are exciting for
       | indie game dev!
        
         | nullandvoid wrote:
         | <deleted>
        
         | AuryGlenz wrote:
         | Too bad according to their page you need an enterprise license
         | for even indie games.
        
       | romanzubenko wrote:
       | As with Stable Diffusion, text prompting will be the least
       | controllable way to get useful output with this model. I can
        | easily imagine MIDI being used as an input with a ControlNet to
        | essentially get a neural synthesizer.
        
         | zone411 wrote:
         | Yes. Since working on my AI melodies project
         | (https://www.melodies.ai/) two years ago, I've been saying that
         | producing a high-quality, finalized song from text won't be
         | feasible or even desirable for a while, and it's better to
         | focus on using AI in various aspects of music making that
         | support the artist's process.
        
           | l33tman wrote:
            | Emad hinted here on HN, the last time this was discussed,
            | that they were experimenting with exactly that. It will come
            | soon, from them or from someone else.
           | 
           | Text-prompting is just a very coarse tool to quickly get some
           | base to stand on, ControlNet is where the human creativity
           | again enters.
        
             | emadm wrote:
              | Yeah, we built ComfyUI, so you can imagine what is coming
             | soon around that.
             | 
             | Need to add more stuff to my Soundcloud
             | https://on.soundcloud.com/XrqNb
        
           | 3cats-in-a-coat wrote:
           | Text will be an important input channel for texture, sound
            | type, voice type and so on. You can't just use input audio;
            | that defeats the point of generating something new. Nor can
            | you use only MIDI; it still needs to know what sits behind
           | those notes, what performance, what instrument. So we need
           | multiple channels.
        
         | numpad0 wrote:
          | It's crazy that nobody cares. It seems to me that ML hype
          | trends focus on denying skill and disproving creativity by
          | denoising random noise into output indistinguishable from
          | human creation, and to me this whole chain of negatives
          | doesn't seem to have proven its worth.
        
           | JAlexoid wrote:
           | LLMs allow people without certain skills to be creative in
           | forms of art that are inaccessible to them.
           | 
            | With DALL-E I can get an image of something I have in my
           | head, without investing into watching hundreds of hours of
           | Bob Ross(which I do anyway)
           | 
           | With audio generators - I can produce music that is in my
           | head, without learning how to play an instrument or paying
           | someone to do it. I have to arrange it correctly, but I can
            | put out a techno track without spending years learning the
            | intricacies.
        
         | raincole wrote:
          | For music, perhaps. For sound effects I think text prompting
          | is a rather good UI.
        
           | bemmu wrote:
            | A ControlNet/img2img style, where you mimic a sound with
            | your mouth and the model then makes it realistic, could
            | also be useful.
        
         | b0ner_t0ner wrote:
         | But works great when you don't need much control, prompt
         | example: "Free-jazz solo by tenor saxophonist, no time
         | signature."
        
         | gcanko wrote:
          | I think it would be ideal if it could take an audio recording
          | of humming or singing a melody, together with a text prompt,
          | and spit out a track that resembles it.
        
       | reissbaker wrote:
       | This is incredibly good compared to SOTA music models (MusicGen,
       | MusicLM). It looks like there's also a product page where you can
       | subscribe to use it, similar to Midjourney:
       | https://www.stableaudio.com/
       | 
       | Sadly it's not open-weight and it doesn't look like there's an
       | API (again like Midjourney): you subscribe monthly to generate
       | audio in their UI, rather than having something developers can
       | integrate or wrap.
        
         | ex3ndr wrote:
          | Thankfully you can train it at home; the bigger question is
          | the data.
        
         | nullandvoid wrote:
         | I was hoping to use it to generate some sound effects to use in
         | a game I'm working on - but looks like I need an "enterprise
         | license" (https://www.stableaudio.com/pricing)
         | 
         | Why does this have a different clause I wonder, and doesn't
         | just fall under "In commercial products below 100,000 MAU"?
        
           | emadm wrote:
           | Different deal with the underlying data holders with revenue
           | share etc
        
         | emadm wrote:
         | There is a CC licensed version soon plus API.
         | 
         | Models are advancing very fast, will be quite the year for
         | music.
        
       | TillE wrote:
       | I was briefly excited about the idea of generating sound effects,
       | but those "footsteps" are incredibly bad.
        
         | laborcontract wrote:
         | I tried generating music on stableaudio.com and, yes, it's bad.
          | However, given the blistering pace of development in these
         | models, I would not be surprised if these sound incredible in a
         | year or two.
        
           | berkes wrote:
           | Everyone every time seems to assume a linear (or exponential)
           | curve upwards.
           | 
           | But what is the proof for that?
           | 
            | I consider it far more likely that we had a breakthrough and
            | are now rushing towards the next plateau. Maybe we are
            | nearing it.
           | 
            | Like the curve of a PID controller. That's how many human
            | improvements go.
        
             | leodriesch wrote:
              | I'd say most are thinking of Midjourney's success in image
             | generation when talking about this kind of progress.
        
               | berkes wrote:
                | I am too.
               | 
               | But I still see no evidence that this keeps improving and
               | not plateauing at some (current?) level.
        
             | spacebanana7 wrote:
              | The plateau we're heading for is getting professional
              | human-level output from these models with logarithmic
              | progress.
             | 
             | I suspect this is because the underlying production factors
             | like compute, data & model design are steadily improving
             | whilst humans have diminishing sensitivity to output
             | quality.
             | 
              | In the game of AI-generated photorealistic images or
              | history essays, there's not much improvement left to make.
             | Most humans are already convinced by the output of these
             | things.
        
       | andrewstuart wrote:
        | I felt a great disturbance in the Force, as though all the music
        | licensing lawyers in the USA cried out at once.
        
         | shon wrote:
         | Perhaps the disturbance you feel is actually the RIAA moving
         | their Death Star into firing range of Stability.ai
        
           | emadm wrote:
           | stableaudio.com is fully licensed, music is an interesting
           | area
           | 
           | https://www.musicbusinessworldwide.com/stability-ai-
           | launches...
        
             | kouteiheika wrote:
             | Serious question, I'd genuinely like to know - why?
             | 
             | You didn't license the images when training Stable
             | Diffusion, and yet you did for Stable Audio? In both cases
             | the training should either be fair use and legal without
             | any licensing, or be infringing and need licensing. Why is
             | audio different than images? Am I missing something here?
        
               | emadm wrote:
                | The law for music is different from that for other
                | media types.
        
       | andbberger wrote:
       | wake me up when it can write a fugue
        
       | alacritas0 wrote:
       | this can produce some pretty disturbing, but interesting music
       | using the prompt "energetic music, violin, voice, orchestra,
       | piano, minimalism, john adams, nixon in china":
       | https://www.stableaudio.com/1/share/953f079e-d704-4138-904c-...
        
         | FergusArgyll wrote:
         | It reminds me a little of breath of the wild guardian music
        
         | seydor wrote:
         | Finally, some music from the future
        
       | MrThoughtful wrote:
       | So many questions ...
       | 
       | They publish the code to train on your own music, but not the
       | weights of their model? So you cannot just upload this thing to
       | some EC2 instance and start creating your own music, correct?
       | 
       | Is this the same as https://www.stableaudio.com?
        
         | alacritas0 wrote:
         | this sounds like progress, but it is still very bad except for
         | highly repetitive music like the EDM examples they give, and
         | even then, it still can't get tempo right
        
         | nextworddev wrote:
         | StabilityAI is just a marketing machine at this point that is
         | praying for an acquisition, since the runway is diminishing
        
       | qwertox wrote:
        | I think we still need the step where the AI learns what a high-
        | quality sound library sounds like and then applies its previously
        | learned abilities by triggering sounds from that library via
        | MIDI.
       | 
       | That way you'd get perfect audio quality with the creativity of a
       | musical AI.
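        The MIDI-plus-sample-library idea above can be sketched: the
        model would emit note events, and a conventional sampler renders
        them through a sound library at full audio fidelity. A toy numpy
        sampler (the decaying-sine "library" is a stand-in for real
        recorded instrument samples; all names and parameters here are
        illustrative assumptions):

```python
import numpy as np

SR = 16000  # sample rate in Hz

def note_to_hz(midi_note):
    # Standard MIDI tuning: A4 (note 69) = 440 Hz
    return 440.0 * 2 ** ((midi_note - 69) / 12)

def sine_library(note, dur):
    # Stand-in "sample library": a decaying sine per note
    t = np.arange(int(SR * dur)) / SR
    return np.sin(2 * np.pi * note_to_hz(note) * t) * np.exp(-3 * t)

def render(events, library, total_sec=2.0):
    # Mix note events (midi_note, start_sec, dur_sec) into one track,
    # pulling each note's sound from the library lookup
    out = np.zeros(int(SR * total_sec))
    for note, start, dur in events:
        tone = library(note, dur)
        i = int(start * SR)
        out[i:i + len(tone)] += tone[:len(out) - i]
    peak = np.max(np.abs(out))
    return out / peak if peak > 0 else out  # peak-normalize

# C major arpeggio: C4, E4, G4, half a second each
track = render([(60, 0.0, 0.5), (64, 0.5, 0.5), (67, 1.0, 0.5)],
               sine_library)
print(len(track))  # 32000 samples = 2 seconds at 16 kHz
```

        The creative decisions (which notes, when) stay with the model,
        while the audio quality is bounded only by the library, which is
        exactly the split being proposed.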
        
         | eru wrote:
         | How would MIDI get you eg a guitar being played dirty? Or some
         | subtle echo that comes from recording in a bathroom?
        
           | arrakeen wrote:
           | the AI designs and controls the effects chain and mastering
           | too
        
           | qwertox wrote:
           | It would use a sampler and for the subtle echo effect add a
           | reverb to the bus.
           | 
           | https://www.youtube.com/watch?v=EQdp2QLiSYQ&t=187s
        
           | sebzim4500 wrote:
            | You could have AI do some postprocessing. I think a similar
           | approach is the future for image generation, you have a model
           | output a 3D scene, use a classical raytracer to do rendering
           | and then have a final model apply corrections to achieve
           | photorealism.
        
         | jchw wrote:
         | I've always wished for something like that for image generation
         | AI. It'd be much cooler/more interesting to watch AI try to
         | draw/paint pictures with strokes rather than just magically
         | iterate into a fully-rendered image. I dunno what kind of
         | dataset or architecture you could possibly apply to accomplish
         | this, but it would be very interesting.
        
           | AuryGlenz wrote:
           | I get what you're saying, but if you watch Stable Diffusion
           | do each step it's at least kind of similar. If you keep the
           | same seed but change a detail, often the broad "strokes" are
           | completely the same.
        
         | 3ds wrote:
         | Isn't that what suno.ai does?
        
       | shon wrote:
       | Interestingly, Ed Newton-Rex, the person hired to build Stable
       | Audio, quit shortly after it was released due to concerns around
       | copyright and the training data being used.
       | 
       | He's since founded https://www.fairlytrained.org/
       | 
       | Reference: https://x.com/ednewtonrex
        
         | az226 wrote:
          | That's an interesting take, but quite an odd stance, since he
          | joined Stability when how Stable Diffusion was trained was
          | already well known.
        
         | doctorpangloss wrote:
         | For generative models, if the model authors do not publish the
         | architecture of their model; and, the model uses a
         | transformation from text to another kind of media; you can
         | assume that they have delegated some part of their model to a
         | text encoder or similar feature which is trained on data that
         | they do not have an express license to.
         | 
         | Even for rightsholders with tens of millions to hundreds of
         | millions of library items like images or audio snippets, the
         | performance of the encoder or similar feature in text-to-X
          | generative models is too poor on the less than a billion
          | tokens of text in these large repositories. This includes
          | Adobe's
         | Firefly.
         | 
         | It is also a misconception that large amounts of similar data,
          | like the kinds that appear in these libraries, are especially
         | useful. Without a powerful text encoder, the net result is that
         | most text-to-X models create things that look or sound very
         | average.
         | 
         | The simplest way to dispel such issues is to publish the
         | architecture of the model.
         | 
         | But anyway, even if it were all true, the only reason we are
         | talking about diffusers, and the only reason we are paying
         | attention to this author's work Fairly Trained, is because of
         | someone training on data that was not expressly licensed.
        
           | sillysaurusx wrote:
           | If you require licensing fees for training data, you kill
           | open source ML.
           | 
           | That's why it's important for OpenAI to win the upcoming
           | court cases.
           | 
           | If they lose, they'll survive. But it will be the end of open
           | model releases.
           | 
           | To be clear, I don't like the idea of companies profiting off
           | of people's work. I just like open source dying even less.
        
             | sillysaurusx wrote:
             | Replying to a deleted comment:
             | 
             | > It sounds as if you imply that would be bad. But what if
             | it wasn't?
             | 
             | Entirely possible. The early history of aviation was open
             | source in the sense that many unlicensed people
             | participated, and died. The world is strictly better with
             | licensing requirements in place for that field.
             | 
             | But no one knows. And if history is any guide for software,
             | it seems better to err on freedoms that happen to have some
                | downside rather than clamping down on them. One could
             | imagine a world where BitTorrent was illegal. Or
             | cryptography, or bitcoin.
        
               | raverbashing wrote:
               | Are you really comparing licensing for a profession with
               | licensing of IP?
        
               | sillysaurusx wrote:
               | It's much the same. Only authorized people are allowed to
               | do X. Since X costs a lot of money, by definition it
               | can't be open source. There are no hobbyist pilots that
               | carry passengers without a license, and if there are,
               | they're quickly told to stop. Generative AI faces a real
               | chance of having the same fate. Which means open source
               | will look similar to these planes trying to compete with
               | commercial aircraft: https://pilotinstitute.com/flying-
               | without-a-license/
               | 
               | If you can think of a better example, I'd like to know
               | though. I'll use it in future discussions. It's hard to
               | think of good analogies when the tech has new social
               | effects.
        
               | PeterStuer wrote:
               | If I fly a plane and crash, my passengers die. If I
               | generate an image using a model whose training included
               | some unlicensed imagery... Disney misses out on a
               | fraction of a cent?
               | 
               | There is a real reason why some professions are licenced
               | and others are not.
               | 
               | Your analogy is nonsensical. Not having a better one is
               | irrelevant.
        
               | sillysaurusx wrote:
               | If training data requires licensing fees, ML
               | practitioners will become a licensed field de facto,
               | because no one in the open source world will have the
               | resources to pursue it on their own.
               | 
               | Perhaps a better analogy is movies. At least with acting,
               | you can make your own movies, even if you're on a
               | shoestring budget. With ML, you quite literally can't
               | make a useful model. There's not enough uncopyrighted
               | data to do anything remotely close to commercial models,
               | even in spirit.
        
               | avisser wrote:
               | > If training data requires licensing fees, ML
               | practitioners will become a licensed field de facto,
               | 
               | You know the word "license" has multiple, dissimilar
               | meanings, right?
        
             | deely3 wrote:
             | > If you require licensing fees for training data, you kill
             | open source ML.
             | 
             | kill open source ML -> decrease speed of improvements for
             | some open source ML
        
               | sillysaurusx wrote:
               | Sadly not. Making something illegal has social effects,
               | not just legal effects. I've grown tired of being
               | verbally spit on for books3. One lovely fellow even said
               | that he hoped my daughter grows up resenting me for it.
               | 
               | It being legal is the only guard against that kind of
               | thing. People will still be angry, but they won't be so
               | numerous. Right now everyone outside of AI almost
               | universally despises the way AI is trained.
               | 
               | Which means you won't be able to say that you do open
               | source ML without risking your job. People will be angry
               | enough to try to get you fired for it.
               | 
               | (If that sounds extreme, count yourself lucky that you
               | haven't tried to assemble any ML datasets and release
               | them. The LAION folks are in the crosshairs for
                | supposedly including CSAM in their dataset, and it's
                | not even a dataset, just an index.)
        
               | multjoy wrote:
               | If everyone is unhappy with your rampant piracy, then
               | perhaps that is a sign that you're doing it wrong?
        
               | sillysaurusx wrote:
               | Perhaps. The reason I did it was because OpenAI was doing
               | it, and it's important for open source to be able to
               | compete with ChatGPT. But if OpenAI's actions are ruled
               | illegal, then empirically open source wasn't a persuasive
               | enough reason to allow it.
        
               | 4bpp wrote:
               | Is there evidence that it's actually everyone or even
               | close to everyone? The core innovation that the internet
               | brought to harassment is that it is sufficient for some
               | 0.0...01% of all people to take issue with you and be
               | sufficiently dedicated to it for every waking minute of
               | your life to be filled with a non-stop torrent of
               | vitriol, as a tiny percentage of all internet users still
               | amounts to thousands.
        
               | viraptor wrote:
               | US copyright has limited reach. There are models trained
               | in China, where the IP rules are... not really enforced.
               | It would be an interesting world where you use / pay for
               | those models because you can't train them locally.
        
               | JAlexoid wrote:
               | > Right now everyone outside of AI almost universally
               | despises the way AI is trained.
               | 
               | I don't agree with this. Most people don't care at all,
               | and at best people would argue about some form of
               | compensation.
               | 
               | Saying "everyone" is unsubstantiated.
               | 
               | I mean... "Everyone was angry at Napster" at the same
               | time "everyone is angry at the MPAA/RIAA"
        
             | marcyb5st wrote:
             | Is there a license that states: if you use this data for ML
             | training you must open source model weights and
             | architecture?
        
               | sillysaurusx wrote:
               | It's deeper than that. The basis of licensing is
               | copyright. If the upcoming court cases rule in OpenAI's
               | favor, you won't be able to apply copyright to training
               | data. Which means you can't license it.
               | 
               | Or rather, you can, but everyone is free to ignore you. A
               | license without teeth is no license at all. The GPL is
               | only relevant because it's enforceable in court.
               | 
               | I'm sure some countries will try the licensing route
               | though, so perhaps there you'd be able to make one.
               | 
               | EDIT: I misread you, sorry. You're saying that if OpenAI
               | loses and license fees become the norm, maybe people will
               | be willing to let their data be used for open source
               | models, and a license could be crafted to that effect.
               | 
               | Probably, yes. But the question is whether there's enough
               | training data to compete with the big companies that can
               | afford to license much more. I'm doubtful, but it could
               | be worth a try.
        
               | JAlexoid wrote:
               | >The GPL is only relevant because it's enforceable in
               | court.
               | 
                | The irony of the GPL is that its validity with respect
                | to users is only now being tested in court.
               | 
               | https://www.dlapiper.com/en/insights/publications/2024/01
               | /sf...
        
             | JoshTriplett wrote:
             | > If you require licensing fees for training data, you kill
             | open source ML.
             | 
             | And likely proprietary ML as well, hopefully.
             | 
             | (To be clear, I think AI is an absolutely incredible
             | innovation, capable of both good and harm; I also think
             | it's not unreasonable to expect it to play a safer, slower
             | strategy than the Uber "break the rules to grow fast until
             | they catch up to you" playbook.)
             | 
             | I'm all for eliminating copyright. Until that happens, I'm
             | utterly opposed to AI getting a special pass to ignore it
             | while everyone else cannot.
             | 
             | Fair use was intended for things like reviews, commentary,
             | education, remixing, non-commercial use, and many other
             | things; that doesn't make it appropriate for "slurp in the
             | entire Internet and make billions remixing all of it at
             | once". The commercial value of AI _should_ utterly break
             | the four-factor test.
             | 
             | Here's the four-factor test, as applied to AI:
             | 
             | "What is the character of the use?" - Commercial
             | 
             | "What is the nature of the work to be used?" - Anything and
             | everything
             | 
             | "How much of the work will you use?" - All of it
             | 
             | "If this kind of use were widespread, what effect would it
             | have on the market for the original or for permissions?" -
             | Directly competes with the original, killing or devaluing
             | large parts of it
             | 
             | Literally every part of the four-factor test is maximally
             | against this being fair use. (Open Source AI fails three of
             | four factors, and then many _users_ of the resulting AI
             | fail the first factor as well.)
             | 
             | > If they lose, they'll survive.
             | 
             | That seems like an open question. If they lose _these_
             | court cases, setting a precedent, then there will be ten
              | thousand more on the heels of those, and it seems
              | questionable whether they'd survive _those_.
             | 
             | > To be clear, I don't like the idea of companies profiting
             | off of people's work. I just like open source dying even
             | less.
             | 
             | You're positioning these as opposed because you're focused
             | on the case of Open Source AI. There are a massive number
             | of Open Source _projects_ whose code is being trained on,
             | producing AIs that launder the copyrights of those projects
              | and ignore their licenses. I don't want Open Source
             | projects serving as the training data for AIs that ignore
             | their license.
        
               | viraptor wrote:
               | > "How much of the work will you use?" - All of it
               | 
               | That depends on the interpretation of "use", and it would
               | be interesting to read what lawyers think. You learned
                | the language largely from speech and copyrighted works
                | (all the stories, books, movies, etc. you ever
                | read/heard). When you wrote this comment, did you use all
                | of them for that purpose? Is the case of AI different?
               | 
               | To be clear that's a rhetorical question - I don't expect
               | anyone here to actually have a convincing enough argument
               | either way.
        
               | JoshTriplett wrote:
               | Principles applied to human brains are not automatically
               | applicable to AI training. To the best of my knowledge,
               | there's no particular law that says a human brain is
               | exempt from copyright, but it empirically is, because the
               | alternative would be utterly unreasonable. No such
               | exemption exists for AI training, nor should it.
               | 
                | Ideas/works/etc. _literally_ live rent-free in your head.
                | That doesn't mean they should live rent-free in an AI's
               | neural network.
               | 
               | Changing that should involve _actually reducing or
               | eliminating copyright_ , for everyone, not giving a
               | special pass to AI.
        
               | JAlexoid wrote:
               | > To the best of my knowledge, there's no particular law
               | that says a human brain is exempt from copyright, but it
               | empirically is, because the alternative would be utterly
               | unreasonable.
               | 
                | The human brain most definitely is not exempt. If you
                | read Lord of the Rings and then write a new book with
                | the same characters and same story line - that's plain
                | copying (look up the etymology of the verb "to copy").
                | If you look at a painting and paint a very similar
                | painting - that's still copying.
               | 
               | Human brains are the reason we have copyright. Your
               | recital of passages from any copyrighted book would
                | violate copyright, if not for the fair use doctrine. And
               | it has nothing to do with whether you do it yourself, or
               | have a TTS engine produce the sound.
        
               | JoshTriplett wrote:
               | The human brain is absolutely exempt, insofar as the copy
               | _stored in your brain_ does not make your brain subject
               | to copyright, even if a subsequent work you produce might
                | be. Nobody's filing copyright infringement claims over
               | people's memories in and of themselves.
               | 
               | I'm saying that AI does not and should not automatically
               | get the exception that a human brain does.
        
               | sillysaurusx wrote:
               | It's not so clear cut. Many lawyers believe all that
               | matters is whether the _output_ of the model is
               | infringing. As much as people love to cite ChatGPT
               | spitting out code that violates copyright, the vast
                | majority of the outputs do not. Those that do are
               | quickly clamped down on -- you'll find it hard to get
               | Dalle to generate an image of anything Nintendo related,
               | unless you're using crafty language.
               | 
               | There's also the moral question. Should creators have the
               | right to prevent their bits from being copied at all?
               | Fundamentally, people are upset that their work is being
               | used. But "used" in this case means "copied, then
               | transformed." There's precedent for such copying and
               | transformation. Fair use is only one example. You're
               | allowed to buy someone's book and tear it up; that copy
               | is yours. You can also download an image and turn it into
               | a meme. That's something that isn't banned either. The
               | question hinges on whether ML is quantitatively
               | different, not qualitatively different. Scale matters,
               | and it's a difference of opinion whether the scale in
               | this case is enough to justify banning people from
               | training on art and source code. The courts' opinion will
               | have the final say.
               | 
               | The thing is, I basically agree with you in terms of what
               | you want to happen. Unfortunately the most likely outcome
               | is a world where no one except billion dollar
               | corporations can afford to pay the fees to create useful
               | ML models. Are you sure it's a good outcome? The chance
               | that OpenAI will die from lawsuits seems close to nil.
               | Open source AI, on the other hand, will be the first on
               | the chopping block.
        
               | bryanrasmussen wrote:
               | >Those that do, are quickly clamped down on -- you'll
               | find it hard to get Dalle to generate an image of
               | anything Nintendo related, unless you're using crafty
               | language.
               | 
                | Really, it seems more like someone was afraid of
                | angering Nintendo (a corporate adversary one does not
                | like to fight) and added a bunch of blocks to keep it
                | from generating anything that offends Nintendo. That
                | does not really translate to quickly and easily
                | blocking offending generations across every copyrighted
                | work in the world.
        
               | raphman wrote:
               | > Many lawyers believe all that matters is whether the
               | output of the model is infringing.
               | 
               | What I don't understand (as a European with little
               | knowledge of court decisions on fair use): with the same
               | reasoning you might make software piracy a case of 'fair
               | use', no? You take stuff someone else wrote - without
               | their consent - and use it to create something new. The
               | output (e.g. the artwork you create with Photoshop) is
               | definitely not copyrighted by the manufacturer of the
               | software. But in the case of software piracy, it is not
               | about the output. With software, it seems clear that the
               | act of taking something you do not have the rights for
               | and using it for personal (financial) gain is not covered
               | by fair use.
               | 
               | Why can OpenAI steal copyrighted content to create
               | transformative works but I cannot steal Photoshop to
               | create transformative works? What am I missing?
        
               | JAlexoid wrote:
               | > make software piracy a case of 'fair use'
               | 
               | That's not a good example. Making a copy of a record you
                | own (for example, ripping an audio CD to MP3) is
               | absolutely fair use. Giving your video game to your
               | neighbor to play - that's also fair use.
               | 
               | Fair use is limited when it comes to
               | transformative/derivative work. Similar laws are in place
               | all over the world, just in US some of those come from
               | case law.
               | 
               | > With software, it seems clear that the act of taking
               | something you do not have the rights for and using it for
               | personal (financial) gain is not covered by fair use.
               | 
               | > Why can OpenAI steal copyrighted content to create
               | transformative works but I cannot steal Photoshop to
               | create transformative works?
               | 
                | That's not a good analogy. The argument, which is not
                | yet settled, is that a model doesn't contain enough
               | copyrightable material to produce an infringing output.
               | 
               | Take your software example - you legally acquire Civ6,
               | you play Civ6, you learn the concepts and the visuals of
               | Civ6... then you take that knowledge and create a game
                | that is similar to Civ6. If you're a copyright
                | maximalist, you would say that creating any games that mimic
               | Civ6 by people who have played Civ6 is copyright
               | infringement. Legally there are definitely lower limits
               | to copyright - like no one owns the copyright to the
               | phrase "Once upon a time", but there may be a copyright
               | on "In a galaxy far far away".
        
               | Ukv wrote:
               | > Why can OpenAI steal copyrighted content to create
               | transformative works but I cannot steal Photoshop to
               | create transformative works? What am I missing?
               | 
                | If Photoshop were hosted online by Adobe, you would be
               | free to do so. It's copyrighted, but you'd have an
               | implied license to use it by the fact it's being made
               | available to you to download. Same reason search engines
               | can save and present cached snapshots of a website (Field
               | v. Google).
               | 
               | In other situations (e.g: downloading from an unofficial
               | source) you're right that private copying is (in the US)
               | still prima facie copyright infringement. However, when
               | considering a fair use defense, courts do take the
               | distinction into strong consideration: "verbatim
               | intermediate copying has consistently been upheld as fair
               | use if the copy is 'not reveal[ed] . . . to the public.'"
               | (Authors Guild v. Google)
               | 
               | If you were using Photoshop in some transformative way
               | that gives it new purpose (e.g: documenting the evolution
               | of software UIs, rather than just making a photo with it
               | as designed) then you may* be able to get away with
               | downloading it from unofficial sources via a fair use
               | defense.
               | 
               | *: (this is not legal advice)
        
               | dannyobrien wrote:
               | So, fair use is seen as a balance, and generally the
               | balance is thought of as being codified under four
               | factors:
               | 
               | https://www.copyright.gov/title17/92chap1.html#107
               | 
               | (1) the purpose and character of the use, including
               | whether such use is of a commercial nature or is for
               | nonprofit educational purposes;
               | 
               | (2) the nature of the copyrighted work;
               | 
               | (3) the amount and substantiality of the portion used in
               | relation to the copyrighted work as a whole; and
               | 
               | (4) the effect of the use upon the potential market for
               | or value of the copyrighted work.
               | 
               | There's more detailed discussion here:
               | https://copyright.columbia.edu/basics/fair-use.html
        
               | navjack27 wrote:
                | Dalle on Bing is happy to generate Mario and Luigi and
                | Sonic and basically every character from everywhere
                | without using crafty language, so I'm unsure of what
                | you're talking about.
        
               | JAlexoid wrote:
               | It would be interesting to see if courts agree that
               | training+transforming = copying.
               | 
               | If I paint a picture inspired by Starry Night(Van Gogh) -
               | does that inherently infringe on the original? I looked
               | at that painting, learned the characteristics, looked at
               | other similar paintings and painted my own. I basically
               | trained my brain. (and I mean the copyright, not the
               | individual physical painting)
               | 
               | And I mean cases where I am not intentionally trying to
               | recreate the original, but doing a derivative(aka
               | inspired) work.
               | 
               | Because it's already settled that recreating the original
               | from memory will infringe on copyright.
        
               | magicalhippo wrote:
               | > "What is the character of the use?" - Commercial
               | 
                | Your first factor doesn't look at all like the one in
                | Stanford's guidelines[1], which they call the
                | transformative factor:
               | 
               |  _In a 1994 case, the Supreme Court emphasized this first
               | factor as being an important indicator of fair use. At
               | issue is whether the material has been used to help
               | create something new or merely copied verbatim into
               | another work._
               | 
                | LLMs mostly create something new, but sometimes seem to
                | be able to regurgitate passages verbatim, so I can see
                | arguments for and against; to my untrained eye it
                | doesn't seem as clear cut.
               | 
               | [1]: https://fairuse.stanford.edu/overview/fair-use/four-
               | factors/
        
               | andybak wrote:
               | Bear with me here. Rushed and poorly articulated post
               | incoming...
               | 
               | In the broadest sense, generative AI helps achieve the
               | same goals that copyleft licences aim for. A future where
               | software isn't locked away in proprietary blobs and users
               | are empowered to create, combine and modify software that
               | they use.
               | 
               | Copyleft uses IP law against itself to push people to
               | share their work. Generative AI aims to assist in writing
                | (or generating) code and make sharing less necessary.
               | 
               | I argue that if you are a strong believer in the ultimate
               | goals of copyleft licences you should also be supporting
               | the legality of training on open source code.
        
               | TheOtherHobbes wrote:
               | The obvious difference is that copyleft is voluntary,
               | while having your art style stolen isn't.
               | 
               | If an artist approached a software developer, created a
               | painting of them using their Mac, and said "There, I've
               | done your job for you" you'd think they were an idiot.
               | 
               | This is the same from the other side. The inability to
               | understand why that's a realistic analogy does not change
               | the fact that it is.
        
               | llm_trw wrote:
               | > The obvious difference is that copyleft is voluntary,
               | while having your art style stolen isn't.
               | 
               | What a curious type of theft where the author keeps their
               | art and I get different art.
        
               | webmaven wrote:
                | > The obvious difference is that copyleft is voluntary,
                | while having your art style stolen isn't.
               | 
               | This is why it is important whether you consider that
               | infringement occurs upon ingestion or output. If it only
               | matters for outputs, then artists have a problem, since
               | copyright doesn't protect styles at all, see for example
               | the entire fashion industry.
               | 
               | There is a saving grace though: Artists can make a case
               | that the association of their distinctive style with
               | their name is at least potentially a violation of
               | trademark or trade dress, especially if that association
               | is being used to promote the outputs to the public. This
               | is a fairly clear case of commercial substitution in the
               | market for creating new works in that artist's style and
               | creating confusion concerning the origin of the resulting
               | work.
               | 
               | Note that the market for creating new works in a
               | particular artist's distinctive and named style kind of
               | goes away upon the artist's passing. What remains is the
               | trademark issue of whether a particular work was actually
               | created by the artist or not, which existing trademark
               | law is well suited to policing, as long as the trademark
               | is defended, even past the expiration of the copyright.
               | 
               | Meanwhile, trademark (and copyright) also apply to the
                | _subjects_ of works, like Nintendo's Mario or Disney's
               | Mickey Mouse or Marvel's Iron Man. But we don't really
               | want models to simply be forbidden from producing them as
               | outputs, or they become useless as tools for the purpose
               | of parody and satire, not to mention the ability to
               | create non-commercial fan art. The potential liability
               | for violating these trademarks by _publishing_ works
               | featuring those characters rests with the users rather
               | than the tools, though, and again existing law is fairly
               | well suited to policing the market. Similarly,
                | celebrities' right of publicity probably shouldn't
               | prevent models from learning what they look like or from
               | making images that include their likeness when prompted
               | with their name, but users better be prepared to justify
               | publishing those results if sued.
               | 
               | You can also make the (technical) argument that if you
               | just ask for an image of Wonder Woman, and you get an
                | image that looks like Gal Gadot as Wonder Woman, the
               | model is overfitting. That's also the issue with the
               | recent spate of coverage of Midjourney producing near-
               | verbatim screenshots from movies.
               | 
               | It might be appropriate though to regulate commercial
               | generative AI services to the extent of requiring them to
               | warn users of all the potential copyright/trademark/etc.
               | violations, if they ask for images of Taylor Swift as
               | Elsa, or Princess Peach, or Wonder Woman, for example.
        
               | kmeisthax wrote:
               | The majority of AI models out there (at least by
               | popularity / capability) are proprietary; with weights
               | and even model architectures that are treated as trade
               | secret. Instead of having human-written music and movies
               | that you legally can't copy, but practically can; you now
               | have slop-generating models that live on a cloud server
               | you have no control over. Artists and programmers who
               | want to actually publish something - copyright or no -
               | now have to compete with AI spam on search engines, while
               | ChatGPT gets to merely be "confidently wrong" because it
               | was built on the Internet equivalent of low-background
               | metal - pre-AI training data. Generative AI is not a road
               | that leads to less intellectual property[0], it's just an
               | argument for reappropriating it to whoever has the
               | fastest GPUs.
               | 
               | This is contrary to the goals of the Free Software
               | movement - and also why Free Software people were the
               | first to complain about all the copying going on. One of
               | the things Generative AI is really good at is plagiarism
               | - i.e. taking someone else's work and "rewriting it" in
               | different words. If that's fair use, then copyleft is
               | functionally useless.
               | 
               | It's important to keep in mind the difference between
               | violating the letter of the law and opposing the business
               | interests of the people who wrote the law. Copyleft and
               | share-alike clauses have the intention of getting in the
                | way of copyright as an institution, but they also rely
                | on copyright to work, which is why the clauses have power
               | even though they violate the spirit of copyright.
               | Generative AI might violate the letter of the law, but
               | it's very much in the spirit of what the law wants.
               | 
               | [0] Cory Doctorow: "Intellectual property is any law that
               | allows you to dictate the conduct of your competitors"
        
               | CaptainFever wrote:
               | Is FSF's stance on AI actually clear? I thought they were
               | just upset it was made by Microsoft.
               | 
               | Creative Commons has been fairly pro-AI -- they have been
               | quite balanced, actually, but they do say that opt-in is
                | not acceptable; it should be opt-out at most. EFF is
                | fairly pro-AI too -- at least, against using copyright to
               | legislate against it.
               | 
               | You shouldn't discount progress in the open model
                | ecosystem. You can sort of pirate ChatGPT by fine-tuning
                | on its responses, there are GPU-sharing initiatives like
                | Stable Horde, there's TabbyML which works very well
               | nowadays, and Stable Diffusion is still the most advanced
               | way of generating images. There's very much of an anti-IP
               | spirit going on there, which is a good thing -- it's what
                | copyleft is there for in spirit, isn't it?
        
               | Ukv wrote:
               | > Fair use was intended for things like reviews,
               | commentary, education, remixing, non-commercial use, and
               | many other things
               | 
               | "many other things" has included, for example, Google
               | Books scanning millions of in-copyright books, storing
                | them internally in full, and making snippets available.
               | 
               | The basis for copyright itself is to "promote the
               | progress of science and useful arts". For that reason a
               | key consideration of fair use, which you've skipped
               | entirely, is the transformative nature of the new work.
               | As in Campbell v. Acuff-Rose Music: "The more
               | transformative the new work, the less will be the
               | significance of other factors", defined as "whether the
               | new work merely 'supersede[s] the objects' of the
               | original creation [...] or instead adds something new".
               | 
               | > "How much of the work will you use?" - All of it
               | 
               | For the substantiality factor, courts make the
               | distinction between intermediate copying and what is
               | ultimately made available to the public. As in Sega v.
               | Accolade: "Accolade, a commercial competitor of Sega,
               | engaged in wholesale copying of Sega's copyrighted code
               | as a preliminary step in the development of a competing
               | product" yet "where the ultimate (as opposed to direct)
               | use is as limited as it was here, the factor is of very
               | little weight". Or as in Authors Guild v. Google:
               | "verbatim intermediate copying has consistently been
               | upheld as fair use if the copy is 'not reveal[ed] . . .
               | to the public.'"
               | 
               | The factor also takes into account whether the copying
               | was necessary for the purpose. As in Kelly v. Arriba
               | Soft: "If the secondary user only copies as much as is
               | necessary for his or her intended use, then this factor
               | will not weigh against him or her"
               | 
               | While there are still cases of overfitting resulting in
               | generated outputs overly similar to training data, I
               | think it's more favorable to AI than simply "it trained
               | on everything, so this factor is maximally against fair
               | use".
               | 
               | > Directly competes with the original, killing or
               | devaluing large parts of it
               | 
               | The factor is specifically the effect of the use upon the
               | work - not the extent to which your work would be
               | devalued even if it had not been trained on your work.
        
               | TheOtherHobbes wrote:
               | None of those arguments make sense. The output of AI
               | absolutely does supersede the objects of the original
               | creation. If it didn't, artists wouldn't care that they
               | were no longer able to make a living.
               | 
               | Substantiality of code does not apply to substantiality
               | of style. What's being copied is _look and feel_, which
               | is very much protected by copyright.
               | 
               | The copying clearly is necessary for the purpose. No
               | copying, no model. The fact that the copying is then
               | compressed after ingestion doesn't change the fact that
               | it's necessary for the modelling process.
               | 
               | Last point - see first point.
               | 
               | IANAL, but if I were a lawyer I'd be referring back to
               | look and feel cases. It's the _essence_ of an artist's
               | look and feel that's being duplicated and used for
               | commercial gain without a license.
               | 
               | That's true whether it's one artist - which it can be,
               | with added training - or thousands.
               | 
               | Essentially what MJ etc. do is curate a library of looks
               | and feels, and charge money for access.
               | 
               | It's a little more subtle than copying fixed objects, but
               | the principle remains the same - original work is being
               | copied and resold.
        
               | Ukv wrote:
               | > None of those arguments make sense. The output of AI
               | absolutely does supersede the objects of the original
               | creation. If it didn't, artists wouldn't care that they
               | were no longer able to make a living.
               | 
               | The question for transformative nature is whether it
               | _merely_ supersedes or instead adds something new. E.g.
               | Google Translate was trained on books/documents
               | translated by human translators and may in part displace
               | that need, but adds new value in on-demand translation of
               | arbitrary text - which the static works it was trained on
               | did not provide.
               | 
               | > Substantiality of code does not apply to substantiality
               | of style.
               | 
               | I'm not certain what you're saying here.
               | 
               | > The copying clearly is necessary for the purpose. No
               | copying, no model.
               | 
               | Which, for the substantiality factor, works in favor of
               | the model developers.
               | 
               | > It's the essence of an artist's look and feel that's
               | being duplicated and used for commercial gain without a
               | license.
               | 
               | Copyright protects works fixed in a tangible medium, not
               | ideas in someone's head. It would protect _a work's_
               | look/appearance (which can be an issue for AI when
               | overfitting causes outputs that are substantially similar
               | to a protected work), but not style or "_an artist's_
               | look and feel".
        
               | JAlexoid wrote:
               | > What's being copied is _look and feel_, which is very
               | much protected by copyright.
               | 
               | If that were the case, no one would be able to paint any
               | cubist paintings. (Picasso estate would own the
               | copyright, to this day)
               | 
               | It's not that clear cut, there are a lot of nuances.
        
               | UncleEntity wrote:
               | Ironically, Picasso was notorious for copying other
               | artist's 'look and feel'...
        
               | JoshTriplett wrote:
               | > "many other things" has included, for example, Google
               | Books scanning millions of in-copyright books, storing
               | them internally in full, and making snippets available.
               | 
               | That succeeds on a different part of the four-factor
               | test, the degree to which it competes with / affects the
               | market for the original.
               | 
               | Google Books is not automatically producing new books
               | derived from their copies that compete with the original
               | books.
        
               | Ukv wrote:
               | > That succeeds on a different part of the four-factor
               | test, the degree to which it competes with / affects the
               | market for the original
               | 
               | It satisfied multiple parts of the four-factor test. It
               | was found to satisfy the first factor due to being
               | "highly transformative", the second factor was considered
               | not dispositive in isolation and favoring Google when
               | combined with its transformative purpose, and it
               | satisfied the third factor as the usage was "necessary to
               | achieve that purpose" - with the court making the
               | distinction between what was copied (lots) and what is
               | revealed to the public (limited snippets).
               | 
               | As you had all factors as "maximally against" fair use,
               | do you believe that AI is significantly less
               | transformative than Google Books? I'd say even in cases
               | where the output is the same format as the content it was
               | trained on, like Google Translate, it's still generally
               | highly transformative.
               | 
               | > the degree to which it competes with
               | 
               | Specifically, to be pedantic, it's the effect _of the
               | use/copying_ of the original copyrighted work.
        
               | fenomas wrote:
               | Where this argument falls down for me is that "use"
               | w.r.t. copyright means copying, and neither AI models nor
               | their outputs include any material copied from the
               | training data, in any usual sense. (Of course the inputs
               | are copied during training, but those copies seem clearly
               | ephemeral.)
               | 
               | Genuinely curious: for anyone who thinks AI obviously
               | violates copyright, how do you resolve this? E.g. do you
               | think the violation happens during training or inference?
               | And is it the trained model, or the model output, that
               | you think should be considered a derived work?
        
               | frabcus wrote:
               | Personally I think trained models are derived works of
               | all the training data.
               | 
               | Just like a translation of a book is a derived work of
               | the original. Or a binary compiled output is a derived
               | work of some source code.
        
               | fenomas wrote:
               | Wikipedia:
               | 
               | > In copyright law, a derivative work is an expressive
               | creation that includes major copyrightable elements of
               | ... the underlying work
               | 
               | A trained model fails that on two counts, doesn't it?
               | Both the "includes" part, and the fact that a model is
               | itself not an expressive work of authorship.
        
               | thuuuomas wrote:
               | Curating training data is an exercise in editorial
               | judgement.
        
               | JAlexoid wrote:
               | You're trying to use words without the legal context
               | here. The legal definition of words isn't 1-to-1 with
               | our colloquial usage.
               | 
               | Translation of a book is non-transformative and retains
               | the original author's artistic expression.
               | 
               | As a counter example - if you write an essay about
               | Picasso's Guernica painting, it is derivative according
               | to our colloquial use of the term, but legally it's an
               | original work.
        
               | barrkel wrote:
               | AI is a genie that you can't really stuff back into a
               | bottle. It's out and it's global.
               | 
               | If the US had tighter regulations, China or someone else
               | would take over the market. If AI is genuinely
               | transformative for productivity, then the US would just
               | fall behind, sooner or later.
        
               | beepbooptheory wrote:
               | Then let them! If another country put forward tighter
               | regulations to help actual people over and above the
               | state that holds them, then that is good in itself, and
               | either way will pay for itself. Why are we worried about
               | China or whoever taking over the market of something that
               | we see has bad effects?
               | 
               | Like, we see this line everywhere now, and it simply
               | doesn't make sense. At some point you just have to believe
               | something, be principled. Treating the entire world as
               | this zero sum deadlock of "progress" does nothing but
               | prevent one from actually being critical about anything.
               | 
               | This would-be Oppenheimer cosplay is growing really old
               | in these discussions.
        
             | iamsaitam wrote:
             | The point should be to kill training on unlicensed
             | material. There needs to be regulation, and tools to
             | identify what the training data was. But as always, first
             | comes the siphoning part, the massive extraction of
             | value; then, when the damage is done, there will be the
             | slow-moving reparations and conservationism.
        
               | cornel_io wrote:
               | A ton of us out here don't agree with your goals. I think
               | these models are transformative enough that the value
               | added by organizing and extracting patterns from the data
               | outweighs the interests of the extremely diffuse set of
               | copyright holders whose data was ingested. So regardless
               | of the technical details of copyright law (which I
               | _still_ think are firmly in favor of OpenAI et al) I
               | would strongly oppose any effort to tighten a legal
               | noose here.
        
               | JAlexoid wrote:
               | Agreed. And every software engineer writing code should
               | pay 10% of their salary to the publishers of the books
               | that they learned their programming skills from.
        
             | dkjaudyeqooe wrote:
             | That makes no sense. OpenAI must lose and it must not be
             | possible to have proprietary models based on copyrighted
             | works. It's not fair use because OpenAI is profiting from
             | the copyright holders' work and substituting for it while
             | not giving them recompense.
             | 
             | The alternative is that any models widely trained on
             | copyrighted work are uncopyrightable and must be disclosed,
             | along with their data sources. In essence this is forcing
             | all such models to be open. This is the only equitable
             | outcome. Any use of the model to create works has the same
             | copyright issues as existing work creation, i.e. if it
             | substantially replicates an existing work it must be
             | licensed.
        
               | sillysaurusx wrote:
               | For what it's worth, I agree with your second paragraph.
               | But it would take legislation to enforce that. For now,
               | it's unclear that OpenAI will lose. Quite the opposite;
               | I've spoken with a few lawyers who believe OpenAI is on
               | solid legal footing, because all that matters is whether
               | the model's _output_ is infringing. And it's not. No one
               | reads books via ChatGPT, and Dalle 3 has tight controls
               | preventing it from generating Pokemon or Mario.
               | 
               | All outcomes suck. The trick is to find the outcome that
               | sucks the least for the majority of people. Maybe the
               | needs of copyright holders will outweigh the needs of
               | open source, but it's basically guaranteed that open
               | source ML will die if your first paragraph comes true.
        
               | dr_dshiv wrote:
               | Proposal: revenue from Generative AI should be taxed 10%
               | for an international endowment for the arts. In exchange,
               | copyright claims are settled.
        
               | Filligree wrote:
               | With a minimum rate, such that no-one can pretend they're
               | getting no income from it.
               | 
               | We might apply that as a $5000 or so surcharge on AI
               | accelerators capable of running the models, such as the
               | 4090.
        
               | dkjaudyeqooe wrote:
               | > But it would take legislation to enforce that.
               | 
               | Absolutely true. That's the end game and we should be
               | working toward influencing that. It's within our power.
               | 
               | > I've spoken with a few lawyers who believe OpenAI is on
               | solid legal footing
               | 
               | No one knows anything, this is too novel, and even if
               | OpenAI gets some fair use ruling, it will be inequitable
               | and legislation is inevitable. OpenAI is between a rock
               | and a hard place here. If you read the basis for fair use
               | and give each aspect serious consideration, as a judge
               | should do, I can't see it passing fair use muster. It's
               | not a case of simply reproducing work, which is unclear
               | here; it's the negative effect on copyright holders, and
               | that effect is undeniable.
               | 
               | > All outcomes suck.
               | 
               | I don't think so. It's possible to fashion something
               | equitable, but people other than the corporations have to
               | get involved.
        
               | Joeri wrote:
               | Just because something is not copyrightable doesn't
               | automatically mean it must be disclosed. If weights
               | aren't copyrightable (and I don't think they should be,
               | as the weights are not a human creation), commercial AIs
               | just get locked behind API barriers, with terms of usage
               | that forbid cloning. Copyright then never enters the
               | picture, unless weights get leaked.
               | 
               | Whether or not that's equitable is in the eye of the
               | beholder. Copyright is an artificial construct, not a
               | natural law. There is nothing that says we must have it,
               | or we must have it in its current form, and I would argue
               | the current system of copyright has been largely harmful
               | to creativity for a long time now. One of the most
               | damning statements I've read in this thread about the
               | current copyright system is how there's simply not enough
               | unlicensed content to train models on. That is the bed
               | that the copyright-holding corporations have made for
               | themselves by lobbying to extend copyright to a century,
               | and it all but assured the current situation.
        
               | dkjaudyeqooe wrote:
               | > Just because something is not copyrightable doesn't
               | automatically mean it must be disclosed.
               | 
               | No, I'm saying that's what the law should be, because
               | models can be built and used without anyone knowing. If
               | it's illegal not to disclose them you can punish people.
               | 
               | Copyright is something that protects the little guy as
               | much as big corps. But the former has more to lose as a
               | group in the world of AI models, and they will lose
               | something here no matter what happens.
        
               | Hoasi wrote:
               | > I would argue the current system of copyright has been
               | largely harmful to creativity for a long time now
               | 
               | I'd love to hear that argument.
               | 
               | How has the current system of copyright been harmful to
               | creativity?
        
             | benreesman wrote:
             | The reality is always a dynamic tension between law,
             | regulation, precedent, and enforceability.
             | 
             | It is possible to strangle OpenAI without strangling AI:
             | pmarca is anti-OpenAI in print, but you can bet your butt
             | he hopes to invest in whatever replaces it, and he's got
             | access to information that like, 10 people do.
             | 
             | A useful example would be the Napster Wars: the music
             | industry had been rent seeking (taking the fucking piss
             | really) for decades and technology destroyed the free ride
             | one way or another. The public (led by the
             | technical/hacker/maker public) quickly showed that short of
             | disconnecting the internet, we were going to listen to the
             | 2 good songs without buying the 8 shitty ones. The
             | technical public doesn't flex its muscles in a unified way
             | very often, but when it does, it dictates what is and isn't
             | on the menu.
             | 
             | The public wants AI, badly. They want it aligned by them
             | within the constraints of the law (which is what "aligned"
             | should mean to begin with).
             | 
             | The public is getting what it wants on this: you can bet
             | the rent. Whether or not OpenAI gets on board or gets run
             | the fuck over is up to them.
             | 
             | "You in the market for a Tower Records franchise Eduardo?"
        
               | emadm wrote:
               | a16z are investors in openai
        
               | benreesman wrote:
               | I'd look again:
               | https://twitter.com/pmarca/status/1756803719327621141
        
             | pk-protect-ai wrote:
             | I would say that GPT-3 and its successors have nothing to
             | do with open source, and if OpenAI uses open source as a
             | shield, then we are all doomed. I would distance myself and
             | any open source projects from involvement in OpenAI court
             | cases as far as possible. Yes, they have delivered some
             | open source models, but not all of them. Their defense must
             | revolve around fair use and purchased content if they use
             | books and materials that were never freely available. It
             | should be permissible to purchase a book or other materials
             | once and use them for the training of an unlimited number
             | of models without incurring licensing fees.
        
             | chasing wrote:
             | > If you require licensing fees for training data, you kill
             | open source ML.
             | 
             | This is another one of those "well if you treat the people
             | fairly it causes problems" sort of arguments. And: Sorry.
             | If you want to do this you have to figure out how to do it
             | ethically.
             | 
             | There are all sorts of situations where research would go
             | much faster if we behaved unethically or illegally.
             | Medicine, for example. Or shooting people in rockets to
             | Mars. But we can't live in a society where we harm people
             | in the name of progress.
             | 
             | Everyone in AI is super smart -- I'm sure they can chin-
             | scratch and figure out a way to make progress while
             | respecting the people whose work they need to power these
             | tools. Those incapable of this are either lazy, predatory,
             | or not that smart.
        
               | sillysaurusx wrote:
               | "Ethical" in this case is a matter of opinion. The whole
               | point of copyright was to promote useful sciences and
               | arts. It's in the US constitution. You don't get to
               | control your work out of some sense of fairness, but
               | rather because it's better for the society you live in.
               | 
               | As an ML researcher, no, there's basically no way to make
               | progress without the data. Not in comparison with billion
               | dollar corporations that can throw money at the licensing
               | problem. Synthetic data is still a pipe dream, and
               | arguably still a copyright violation according to you,
               | since traditional models generate such data.
               | 
               | To believe that this problem will just go away or that we
               | can find some way around it is to close one's eyes and
               | shout "la la la, not listening." If you want to kill open
               | source AI, that's fine, but do it with eyes open.
        
               | chasing wrote:
               | Yes, it's true that open source projects that cannot pay
               | to license content owned by other people are at a
               | disadvantage versus those who can. Open source projects
               | cannot, for example, wholly copy code owned by other
               | people.
               | 
               | Also, beware of originalist interpretations of the
               | Constitution. I believe there's been about 250 years of
               | law clarifying how copyright works, and, not to beat a
               | dead horse, I don't think it carves out a special
               | exception for open source projects.
        
           | silviot wrote:
           | > But anyway, even if it were all true, the only reason we
           | are talking about diffusers, and the only reason we are
           | paying attention to this author's work Fairly Trained, is
           | because of someone training on data that was not expressly
           | licensed.
           | 
           | Thanks for putting this into words. I'm of the same opinion
           | and this is the best articulation I have so far.
        
         | prmoustache wrote:
         | Not that it would have stopped the company from doing it
         | anyway, but couldn't he think about that before working for
         | them?
         | 
         | Or did he need that, as it is part of the business model of
         | his certifications?
        
           | emadm wrote:
           | It's a complex topic and perceptions change.
           | 
           | Ed still likes Stability, especially as we fully trained
           | Stable Audio on rights-licensed data (a bit different in
           | audio compared to other media types), and offer opt-out of
           | datasets, etc.
        
         | ImprobableTruth wrote:
         | Calling him "the person hired to build Stable Audio" seems a
         | bit misleading? He was in an executive position (VP of product
         | for Stability's audio group). An important position, but
         | "person hired to build" to me evokes the image of lead
         | developer/researcher.
         | 
         | I think that also helps in understanding his departure, since
         | he's a founder with a music background.
        
           | a_vanderbilt wrote:
           | It isn't unusual for those in leadership positions to use
           | such phrasing when talking about projects and products. It's
           | not a "taking credit" from the engineers sort of thing, but
           | rather about the leadership of the engineers.
        
             | ARandomerDude wrote:
             | Person A gets hired to write the software that is the
             | company's actual product.
             | 
             | Person B gets hired to observe Person A working, check
             | email, and be the audio output buffer for Jira.
             | 
             | Person B says "I built this."
             | 
             | That's dishonesty no matter what the titles are or how
             | important the emails were.
        
             | Zetaphor wrote:
             | Managing a group of people is not synonymous with doing the
             | actual knowledge work of researching and developing
             | innovations that enabled this technology. I find it hard to
             | believe that the contribution of his management somehow
             | uniquely enabled this group of engineers to create this
             | using their experience and expertise.
             | 
             | A captain may steer the ship, but they're not the one
             | actually creating and maintaining the means by which it
             | moves.
        
             | shon wrote:
             | Agreed. Leadership can sometimes bring actual value ;)
             | 
             | And to be clear, I'm not sure Ed would call himself that.
             | Those are my words, not his.
        
         | gcanko wrote:
         | There has to be a solution for the copyright roadblocks that
         | companies encounter when training models. I see it as no
         | different from an artist creating music influenced by the
         | music the artist has listened to throughout his whole life;
         | fundamentally it's the exact same thing. You cannot create
         | music, or art in general, in a vacuum.
        
       | jpc0 wrote:
       | > Warning: This website may not function properly on Safari. For
       | the best experience, please use Google Chrome
       | 
       | Do better
        
         | popalchemist wrote:
         | Have you ever heard of an MVP?
        
           | prmoustache wrote:
           | That would be pertinent if it weren't just a static web
           | page with text and some audio files to be played.
        
             | zamadatix wrote:
             | Reading about it, that ironically seems to be the exact
             | problem Safari has. I mean, the page "works" in Safari;
             | it's just that you get these really random delays before
             | the start of some of the sounds, with all sorts of web
             | discussion threads suggesting different ways to mitigate
             | it on different platforms. I don't really fault them for
             | having the goal of publishing a paper and going the extra
             | bit to make a friendly but imperfect webpage, instead of
             | being website creators who happen to publish papers on
             | the side.
        
         | pmontra wrote:
         | By the way, it does work on Firefox Android. No idea of what
         | there is in Safari that's not standard in Chrome and Firefox.
        
         | Aachen wrote:
         | ...and recommend Firefox
         | 
         | is what you meant to say right? :)
        
       | lopkeny12ko wrote:
       | > We append "high-quality, stereo" to our sound effects prompts
       | because it is generally helpful.
       | 
       | It's hilarious that we've discovered you can get better outputs
       | from LLMs by simply nicely telling them to generate better
       | results.
        
         | nine_k wrote:
         | Maybe sometimes you want an old cassette sound, or even older
         | scratched 78 rpm sound, etc. Computers, as usual, do what you
         | asked them to do, not what you meant.
        
       | ttul wrote:
       | I find it interesting that they are releasing the code and lovely
       | instructions for training, but no model. They are almost begging
       | anonymous folks to hook the data loader up to an Apple Music
       | account and go nuts. Not that I am suggesting anyone do that.
        
         | zamadatix wrote:
         | Speculatively, it might have been part of an agreement that
         | when they were given the licensed stock audio library from
         | AudioSparx to train on, they wouldn't redistribute the
         | resulting model.
        
       | jsiepkes wrote:
       | > Warning: This website may not function properly on Safari. For
       | the best experience, please use Google Chrome.
       | 
       | We've come full circle back to the 90s and Internet Explorer.
       | Well, I guess this time the dominant browser is open source, so
       | that's at least something...
       | 
       | Can someone please create an animated GIF button for Chrome which
       | says: "Best viewed with Google Chrome"?
        
         | Maxion wrote:
           | Chrome isn't open source, Chromium is. Best not to confuse the
         | two.
        
           | schleck8 wrote:
           | Chrome and Chromium are virtually identical except for
           | Google services, which aren't required for anything except
           | installing Chrome extensions - and those can alternatively
           | be sideloaded - so this is nitpicking.
        
             | berkes wrote:
             | It's essential nitpicking
        
             | urbandw311er wrote:
             | Jumping in to defend the parent comment: there's nothing
             | open source about Google Chrome, and it's highly relevant
             | in this context because Google is notorious for putting
             | technologies and tracking in there that many people find
             | objectionable.
        
             | forgotusername6 wrote:
             | Tangential, but I tried to build Chromium the other day
             | and stopped when it said it required access to Google
             | Cloud Platform to actually build it. If something requires
             | a proprietary build system, does it matter that it's open
             | source?
        
               | nolist_policy wrote:
               | That is not true. See every distribution packaging
               | chromium.
               | 
               | In particular, this package[1] by openSUSE builds
               | completely offline. Many other distributions require
               | packages to build offline.
               | 
               | [1] https://build.opensuse.org/package/show/network:chrom
               | ium/chr...
        
               | forgotusername6 wrote:
               | I think I got my wires crossed with ChromiumOS which when
               | I last read the docs seemed to suggest that Google cloud
               | platform was required. I now can't find those specific
               | docs either so I retract my statement.
        
             | squeaky-clean wrote:
             | Don't forget media DRM built into Chrome but not Chromium.
        
           | m463 wrote:
           | I found this article to explain it well:
           | 
           | https://www.lifewire.com/chromium-and-chrome-
           | differences-417...
           | 
           | and there is a further ungoogled-chromium:
           | 
           | https://en.wikipedia.org/wiki/Ungoogled-chromium
        
         | superb_dev wrote:
         | The website works fine on Safari too; I didn't notice any
         | issues.
        
           | nness wrote:
           | Same, I wonder what issue they thought they had...
        
             | earthnail wrote:
             | Safari is known to be troublesome when a webpage contains
             | many HTML audio players. It can get extremely slow and
             | unresponsive.
             | 
             | Every researcher I know in the audio domain uses Chrome
             | for exactly that reason. The alternative would be not to
             | use the standard HTML audio tag, which would be
             | ridiculous.
        
         | IndisciplineK wrote:
         | > Can someone please create an animated GIF button for Chrome
         | which says: "Best viewed with Google Chrome"?
         | 
         | Here you go:
         | 
         | <img src="data:image/gif;base64,R0lGODlhWAAfAKIAAAAAAP///zNm/zO
         | ZM//MM/8zM8zMzP///yH/C05FVFNDQVBFMi4wAwEAAAAh+QQFZAAHACwAAAAAWA
         | AfAAAD/wi63P4wyklnuDjrzbv/YNgpQWGehaGubOu+cCzP81KiJr02uqLvlYnBh
         | sv9fEMADXmEFAuRJKBELZR+SRVyAVRym40n6iGtVrG8rI/J7DHETx7RnGpqvYy7
         | Hr2Ai/NzGXVuem2FSnwAfnBcREWJbI2RiYt/ayRPWJqbQANPGgShoqGXU1anV5y
         | qQDAKA54nFwKzsxejpHimdC9beKsthjuvsCYBtMcBt6RlqKe8iMG/WbzDsMbHyM
         | q5VILPh3fQvr2IUuTA1cXY2bfbmc+9auLy8dMuANWe1+oCyezMj+/ClZtX6lK9c
         | +j0qes3qt2FYoPskTPIwsGeb9TwKcTGUJuUQys3YkwqtyfSOHMV8b3aWEsZgY83
         | IkqbeUelLGQdcTHTIJPmRT4qV2YY4LJUTBR2hnyDt2lBUJXajLpbkusgU01Onw4
         | rKrWKEapS6EmKJjKr1qhdT32t4UWpQXhkA957ijZtzERh6el10wBqQ4uBMPQsW1
         | UvRb4OqnmE8A8HH1bT3qKUEUSII8c+M/u0Ubmz58+gQ4seTXpBAgAh+QQFyAAHA
         | CwNAAMAKwAZAAADf2i63P4wyvkArSBfZ/fqHhcaIJOB55d26VQqKPnJsBxLrBbv
         | s3VqOBPN1iu+XMUaTzk8YoAtElCqg01HHid2E916v+DwNQz5+bRka2OcNr3M6hz
         | 0R1Xp3jnqOZ6X+vVreVNzf4RmXYV7an6EjCVjhiiCfXeBK5NujZp0bZ2eDAkAOw
         | ==">
         | 
         | Edit: View the button:
         | https://indiscipline.github.io/post/best-viewed-in-google-ch...
        
           | rikafurude21 wrote:
           | Surprised to see an actual GIF pop up after adding that to a
           | site. I guess that's just base64; still kind of amazing that
           | it's all inside a seemingly random string of text.
        
             | IndisciplineK wrote:
             | By the way, you can simply paste the base64-encoded data
             | (everything inside the quotes) into your address bar to
             | view it. Probably not the safest action generally, but
             | should be OK if it's an image.
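The data-URI mechanics the two comments above describe can be sketched in a few lines of Python; the 1x1 transparent GIF here is a stand-in for the actual button image:

```python
import base64

# A well-known 1x1 transparent GIF, standing in for the button image.
gif_bytes = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)
assert gif_bytes.startswith(b"GIF89a")  # the GIF magic number is intact

# Rebuild a data URI of the same shape as the <img src="..."> above;
# pasting the result into a browser address bar renders the image.
data_uri = "data:image/gif;base64," + base64.b64encode(gif_bytes).decode("ascii")
print(data_uri[:40])
```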
        
       | ecmascript wrote:
       | Just a few days ago I was downvoted for stating that AI will be
       | better at creating music than humans:
       | https://news.ycombinator.com/item?id=39273380#39273532
       | 
       | Now this is released, and I feel I have grist for my mill.
       | 
       | Sure, it still kind of sucks, but it's very impressive for a
       | _demo_. Remember that this tech is very much in its infancy, and
       | it's already very impressive.
        
         | larschdk wrote:
         | I don't find this music to be good in any way. It sounds
         | interesting over a few notes, but then completely fails to find
         | any kind of progression that goes anywhere interesting, never
         | iterating on the theme, never teasing you with subtle or
         | surprising variation on a core theme, no build-ups or clear
         | resolution. Very annoying to actually listen to.
        
       | webprofusion wrote:
       | Music is perfect for AI generation using trained models, because
       | artists have been copying each other for at least the past 100
       | years and having a computer do it for you is only notionally
       | different. Sure a computer can never truly know your pain, but it
       | can copy someone else's.
        
       | kleiba wrote:
       | My son suggested to play "Calm meditation music to play in a spa
       | lobby" and "Drum solo" at the same time - sounds pretty good,
       | actually...
        
       | PeterStuer wrote:
       | "Gen AI is the only mass-adoption technology that claims it's Ok
       | to exploit everyone's work without permission, payment, or
       | bringing them any other benefit."
       | 
       | Is it? What about the printing press, photography, the copier,
       | the scanner ...
       | 
       | Sure, if a commercial image is used in a commercial setting,
       | there is a potential legal case to be made about infringement.
       | This should NOT depend on the means of production, but on the
       | merits of comparing the produced images.
       | 
       | Xerox should not be sued because you can use a copier to copy a
       | book (trust me kids, book copying used to be very, very big).
       | 
       | Art by its social nature is always derivative. I can use
       | diffusion models to create uncontestably original imagery. I can
       | also try to get them to generate something close to an image in
       | the training set, if the model is large enough compared to the
       | training set or the work is just really formulaic. However, it
       | would be far easier and more efficient to just Google the image
       | in the first place and patch it up with some Photoshop if that
       | were my goal.
        
         | wnkrshm wrote:
         | But the social nature of art also means that humans give the
         | originator and their influences credit - of course not the
         | entire chain, but at least the nearest neighbours of
         | influence. A user of a diffusion generator, meanwhile, does
         | not even know the influences unless the model is specifically
         | asked.
         | 
         | Shoulders of giants as a service.
        
         | webmaven wrote:
         | _> Xerox should not be sued because you can use a copier to
         | copy a book (trust me kids, book copying used to be very, very
         | big)._
         | 
         | The appropriate analogy here isn't suing Xerox, but suing
         | Kinko's (now FedEx Office).
         | 
         | And it isn't just books, but other sorts of copyrighted
         | material as well, such as photographs, which are still an
         | issue.
        
         | haswell wrote:
         | > _Art by its social nature is always derivative, I can use
         | diffusion models to create uncontestably original imagery_
         | 
         | How are you defining "uncontestably original" here?
         | 
         | The output could not exist if not for the training set used to
         | train the model. While the process of deriving the end result
         | is different than the one humans use when creating artwork, the
         | end result is still derived from other works, and its
         | originality differs in degree, not in kind, from human
         | output. (I acknowledge that the AI tool is
         | enabled by a different process than the one humans use, but I'm
         | not sure that a change in process changes the derivative nature
         | of all subsequent output).
         | 
         | As a thought experiment, imagine that, assuming we survive,
         | after another million years of human evolution our brains can
         | process imagery at the scale of generative AI models and can
         | produce derivative output taking into account more influences
         | than any human could even begin to approach with our 2024
         | brains.
         | 
         | Is the output no longer derivative?
         | 
         | Now consider the future human's interpretation of the work vs.
         | the 2024 human's interpretation of the work. "I've never seen
         | anything like this", says the 2024 human. "The influences from
         | 5 billion artists over time are clear in this piece" says the
         | future human.
         | 
         | The fundamental question is: on what basis is the output of an
         | AI model original? What are the criteria for originality?
        
         | zamadatix wrote:
         | Where was this quote pulled from? I can't find it in the site,
         | paper, or code repo readmes for some reason. Did the HN link
         | get changed?
        
       | slicerdicer1 wrote:
       | Obviously someone shadowy and non-corporate (e.g. an artist)
       | just needs to come out and make a model which includes promptable
       | artist/producer/singer/instrumentalist/song metadata.
       | 
       | Describing music without referring to musicians is so clunky
       | because music is never labelled well. Of course saying "disco
       | house with funk bass and soulful vocals, uplifting" is going to
       | be bland. Saying "disco house with Nile Rodgers rhythm guitar,
       | Michael McDonald singing, and a bassline in the style of Patrick
       | Alavi's Power" is going to get you some magic.
        
         | ever1337 wrote:
         | So this model can only ever understand music which is
         | classified, described, labelled, standardized - and recombine
         | those. Sounds boring; sounds like the opposite of what (I
         | would like to believe) people listen to music for, outside of
         | a corporate stock audio context.
        
       | gregorvand wrote:
       | Not trying to knock the progress here - it's impressive. But as
       | a drummer, 'drum solo' is about as boring as it gets, with some
       | weird interspersed sounds. So, it depends on the intended
       | audience.
       | 
       | FWIW the sound effects also are not 'realistic' to my ear, at the
       | moment.
       | 
       | But again, the progress is huge, well done!
        
         | toxik wrote:
         | Yeah, the drum solo really highlights how badly the model
         | missed the point of a drum solo. I'm not a drummer, but this
         | is just not pleasing to hear. Sounds like somebody randomly
         | banging drums more or less in tempo.
         | 
         | It does okay with muzak-type things though, which I guess
         | tracks with my expectations.
        
         | ZoomZoomZoom wrote:
         | As a drummer, the 'drum solo' was surprisingly interesting to
         | listen to, if you consider it happening over a stable 4/4
         | pulse. The random-but-not-quite nature of the part makes for
         | very unconventional rhythmic patterns. I'd like to be able to
         | syncopate like this on the spot.
         | 
         | Don't ask me to transcribe it.
         | 
         | Tempo consistency is great. Extraneous noises and random cymbal
         | tails show the deficiency of the model though.
        
         | redman25 wrote:
         | I think I was more disappointed by the music samples not having
         | any transitions. Most songs have key changes and percussion
         | turnovers.
        
       | TrackerFF wrote:
       | Now, if they could also generate accompanying MIDI tracks -
       | that'd be great.
       | 
       | That would add some much-needed levels of customization.
        
       | seydor wrote:
       | Trying to describe music with words is awkward! We need a model
       | that is trained on dance
        
       | zdimension wrote:
       | The few examples I was able to play are very promising,
       | unfortunately the host seems to be getting some sort of HN-hug,
       | because all the audio files are buffering every other second --
       | they seem to throttle at 32 KiB/s.
        
       | Jeff_Brown wrote:
       | Music without changes is boring. I enjoyed the much less stable
       | results of OpenAI's Jukebox (2020) more than any music AI to
       | come since. Their sound quality is better, but they only seem
       | to produce one monotonous texture at a time.
        
         | coldcode wrote:
         | As a musician, I found the pieces unremarkable. Of course, a
         | lot of contemporary music is forgettable as well, as people try
         | to create songs that all sound like hits but, in doing so,
         | create uninteresting songs. I wonder what music the model is
         | based on. I suppose for game music/sounds, perhaps it's good
         | enough?
        
       | 3cats-in-a-coat wrote:
       | The reconstruction demo is in effect an audio compression codec.
       | And I bet it makes existing audio codecs look like absolute toys.
        
       | emadm wrote:
       | This is part of a paper on the prior version of the model:
       | https://x.com/stableaudio/status/1755558334797685089?s=20
       | 
       | https://arxiv.org/abs/2402.04825
       | 
       | Which outperforms similar music models.
       | 
       | The pace is accelerating and even better ones are coming with far
       | greater cohesion and... stuff. Will be quite the year for music.
        
         | emadm wrote:
         | Particularly interesting with the scaled up version of
         | https://www.text-description-to-speech.com
         | 
         | Do try https://www.stableaudio.com for a rights-licensed model
         | you can use commercially.
        
       | nprateem wrote:
       | The problem with music generation is the difficulty of editing.
       | Photos and text can be easily edited, but music can't be. Either
       | the piece needs to be MIDI, with relevant parameterisation of
       | instruments, or there needs to be a UI that allows segments of
       | the audio to be reworked, like in-painting.
        
       | Wistar wrote:
       | A small point: it needs to be in something other than 44.1 kHz.
       | The two models they compare against are at either 32 kHz or
       | 48 kHz, both of which are friendlier for video work, something
       | for which I think AI audio will be used a lot.
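The friction behind the comment is the awkward 44.1 kHz to 48 kHz ratio; the integer factors a polyphase resampler would need can be derived with the stdlib alone (a sketch, not tied to any particular resampling library):

```python
from math import gcd

# CD-style audio (44.1 kHz) vs. the video-friendly rate (48 kHz).
sr_in, sr_out = 44_100, 48_000

g = gcd(sr_in, sr_out)            # greatest common divisor of the two rates
up, down = sr_out // g, sr_in // g

# 44.1 kHz -> 48 kHz requires an awkward 160/147 polyphase ratio,
# while 32 kHz -> 48 kHz is a clean 3/2 -- one reason 44.1 kHz
# sources are less friendly for video work.
print(f"{sr_in} -> {sr_out}: up={up}, down={down}")
```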
        
       | m3kw9 wrote:
       | Lots of work left to do man
        
       ___________________________________________________________________
       (page generated 2024-02-13 23:01 UTC)