[HN Gopher] Open-sourcing AudioCraft: Generative AI for audio
       ___________________________________________________________________
        
       Open-sourcing AudioCraft: Generative AI for audio
        
       Author : iyaja
       Score  : 738 points
       Date   : 2023-08-02 15:36 UTC (7 hours ago)
        
 (HTM) web link (ai.meta.com)
 (TXT) w3m dump (ai.meta.com)
        
       | peteforde wrote:
       | I just ran all of the cited installation steps, which appear to
       | have been successful... but I am now experiencing a profound
       | sense of "now what?"
       | 
        | There don't appear to be any new CLI executables installed, and
        | the documentation links to an API, but there are no clues on how
        | to actually process a prompt.
       | 
       | What am I missing? Alternatively, I wouldn't mind using it in a
       | Notebook but so far this thread doesn't link to anything so
       | ambitious (yet?)
        
         | [deleted]
        
         | javajosh wrote:
         | You're not supposed to actually install it and use it, just
         | comment on how cool and open Facebook is, especially in
         | comparison to OpenAI. So, user error.
        
           | parhamn wrote:
            | Right, it's not like anyone has operationalized Llama 2, or
            | like there aren't hundreds of repos for inference servers
            | and the like. /s
        
         | speedgoose wrote:
          | The main Gradio app has been moved to the demos folder:
          | python demos/musicgen_app.py
          | 
          | Otherwise you can check the Jupyter notebooks in the same
          | folder.
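          | 
          | If you'd rather call the API directly from Python, something
          | along these lines should work (a sketch based on the repo
          | README; the exact pretrained model name may differ, so treat
          | 'facebook/musicgen-small' as an example):
          | 
          |     from audiocraft.models import MusicGen
          |     from audiocraft.data.audio import audio_write
          | 
          |     # Smallest checkpoint; larger ones need more VRAM
          |     model = MusicGen.get_pretrained('facebook/musicgen-small')
          |     model.set_generation_params(duration=8)  # seconds
          | 
          |     # One clip per text prompt
          |     wavs = model.generate(['lo-fi beat for the beach'])
          |     for i, wav in enumerate(wavs):
          |         # Writes 0.wav, 1.wav, ... in the current directory
          |         audio_write(f'{i}', wav.cpu(), model.sample_rate,
          |                     strategy='loudness')
          | 
          | That should get you one .wav per prompt without touching the
          | gradio demo at all.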
        
           | peteforde wrote:
            | Thanks! This would be even more helpful if you could share a
            | hint about where this was installed.
           | 
           | I carefully went through the output generated by the "pip
           | install -U audiocraft" command, and there were no clues
           | provided.
           | 
           | Disclosure: I am not a Python developer, so I apologize if
           | this is a master-of-the-obvious question for Python folks.
           | However, if there was ever a scenario where a line or two of
           | post-install notes would be useful, it's stuff like this.
        
             | speedgoose wrote:
             | You may have to clone the repository to get the demos
             | folder. Otherwise it's perhaps somewhere depending on how
             | you use python (global and often broken environment,
             | virtual environments, conda hell, etc...).
             | 
              | I feel like Python folks are on average terrible at
              | distributing software. So many projects have some Python
              | script to install the dependencies, still assume you use
              | conda, or don't bother to specify dependency versions.
              | Thankfully the patterns are often the same, and after some
              | time you understand what to do based on the error
             | messages. But I wish they could use something like NPM or
             | Cargo. Even something like Maven would be an improvement.
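              | 
              | To answer the actual question: one quick way to see where
              | pip put the package (run it in whatever environment you
              | installed into) is:
              | 
              |     import audiocraft, os
              |     print(os.path.dirname(audiocraft.__file__))
              | 
              | Note that this only locates the installed library; the
              | demos folder may still only be in the git checkout, as
              | mentioned above.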
        
         | moffkalast wrote:
          | This is the default state of deep learning projects: everyone
          | assumes that only PhD researchers who already know every part
          | of the toolchain will ever try them. What's happened with
          | llama and other LLMs, with codebases that actually work with
          | one click once compiled, is a pretty big outlier.
        
       | wg0 wrote:
       | "What a time to be alive!"
        
       | s1k3s wrote:
       | The demos are great. Could someone explain what's in it for Meta
       | open sourcing all these models?
        
         | jamil7 wrote:
          | Not a fan of Meta, but haven't they generally been pretty
          | forthcoming about open-sourcing their tech?
        
         | bick_nyers wrote:
         | If anything it makes them appear to be one of the best places
         | to work at to do research. Could be them playing the long game.
        
         | Saturdays wrote:
          | Theoretically, what's in it for them is that people will build
          | content faster and with fewer barriers, for eventual
          | consumption on their platforms.
        
         | ipaddr wrote:
          | They haven't open-sourced much. Open model code with weights
          | under a restrictive non-commercial license is something, I
          | guess.
         | 
         | They are trying to kill the market before they get left out.
        
         | jdadj wrote:
         | Commoditize Your Complement?
         | 
         | https://gwern.net/complement
        
           | maximus-decimus wrote:
           | What is it a complement to though?
        
             | jwestbury wrote:
             | Content is a complement to social media.
        
             | CrypticByte87 wrote:
             | Meta has several of the biggest UGC platforms, and in this
             | case the complement is content itself. Reels with
             | autogenerated (and royalty free) background music is the
             | obvious example but I'm sure there are more. Maybe creative
             | for ads as well?
        
             | jononor wrote:
              | To Metaverse access. Filling the metaverse with engaging,
              | interactive 3D content is an insane job with 2020
              | technology. It requires a huge amount of skilled labor
              | across a range of disciplines to create the 3D models,
              | soundtracks, NPC dialog, visuals, etc. that make a
              | compelling experience. By 2030 that may have been reduced
              | to the point that everyone with creativity and Internet
              | access can do it. Sure, most of it will be silly things -
              | but so is most social media today, and that does not make
              | it any less of a commercial success. And there will be
              | millions of semi-pro creators making the things with
              | higher production value, like with videography today.
        
             | raincole wrote:
              | In the short term it's social media, because people will
              | share whatever they generate there. But I don't think
              | that's a very strong incentive for Meta to invest in AI.
        
         | gostsamo wrote:
          | They want to commoditize the offerings of OpenAI, Google, MS,
          | and Apple. Also, they gain mindshare and goodwill after years
          | of bad publicity. Contributions back from the community might
          | help them improve the models for free.
         | 
         | If they just keep their models, people won't be interested and
         | will build over ChatGPT or Bard.
        
         | kypro wrote:
          | A competitive open-source project basically destroys the
          | pricing power of all closed-source alternatives.
          | 
          | If you're a company that wants to integrate an LLM into your
          | product and the choice is between several equally good models,
          | one of which is free and open source, which would you pick?
         | 
         | Aside from keeping competition at bay, this move also gives
         | Meta leverage because ecosystems are now being built around
         | their projects. If these models see wide-scale adoption they
         | could later launch AudioCraft+ as a licensed version with some
         | extra features for example.
         | 
         | Alternatively, they might offer support or hosting for their
         | open source projects.
         | 
         | Right now though I think the primary benefit of these open
         | sourced models is to attract talent. If Meta is seen as one of
         | the leaders in AI then researchers will want to work for them
         | simply for the prestige.
         | 
          | Arguably, one of the reasons Meta has been behind so many
          | awesome projects like PyTorch and React over the last decade
          | is that they were seen as the cool place for talented,
          | recently graduated software engineers to work around 2010.
        
         | unnouinceput wrote:
          | The same move Microsoft made back in the '90s to kill
          | Netscape: make your product the one available to the masses,
          | and the next generation of users will be using your product.
        
           | zyang wrote:
           | I was just thinking how Google made Android free to check
           | Microsoft. This is Meta checking Google.
        
             | roody15 wrote:
             | Checking Google or OpenAI (or both?)
        
             | ipaddr wrote:
             | Checking OpenAI. Google is still playing checkers.
        
               | conductr wrote:
               | The fact you believe, rightly or wrongly, that meta is
               | ahead of google on ai explains why meta would open source
               | this. It's a good reputation to maintain.
        
               | mcbuilder wrote:
                | I just can't get over how badly Google is doing. They
                | have a ton of top researchers, papers, money, just no
                | good LLMs. It's like OpenAI was first to the punch, and
                | everyone else just saw $$$. Meta was smart to go down
                | this open source road, as the masses will start training
                | their llamas one way or another. Personally I believe
                | the "intelligence" aspect will asymptote, so even having
                | exclusive access to a "super AI" (i.e. a hypothetical 1T
                | parameter model like a GPT5) won't put you that far
                | ahead of the lesser AIs, and as soon as you grant access
                | to the masses they will start to use some transfer
                | learning to make their "lesser" models better. AI
                | applications, though, still need a lot of work. The
                | models aren't smart or general-purpose enough to be
                | useful to the average person out of the box.
        
               | rvnx wrote:
                | The problem also is that Google is making a lot of
                | grandiose announcements about tools and models that
                | nobody can see or use. That is a serious credibility
                | problem in the long term.
        
         | nprateem wrote:
          | If people love hanging out with ChatGPT or Bard, they won't be
          | wasting their precious little eyeballs on FB/Insta.
        
         | klapinat0r wrote:
         | Somewhat relevant, Yann LeCun insisted the research should be
         | open sourced. At least in an academic sense.
         | 
         | He touches on it briefly in this podcast episode:
         | https://www.therobotbrains.ai/who-is-yann-lecun
        
         | hereonout2 wrote:
        | Was asking myself the same earlier. I'm sure it is largely to
         | do with publicity and the fact that selling these services is
         | not their core business. At the very least releasing this stuff
         | probably won't damage their core business but will take the
         | sheen off of some other big names.
         | 
         | I wondered though, generative AI is hurling us into a world
         | where we'll need more mechanisms to sort real from fake,
         | provenance will play a large part, and meta's platforms could
         | be part of the answer. i.e. content linked to actual verifiable
         | people.
        
         | vasili111 wrote:
         | They will own most popular open models so they can dictate the
         | direction in open source AI.
        
         | squidsoup wrote:
         | The demos are, unsurprisingly, soulless muzak. This contributes
         | nothing to our culture.
        
       | TheAceOfHearts wrote:
       | AudioGen seems really fascinating. I have some dumb questions.
       | 
       | While the datasets used for training AudioGen aren't available,
       | is there any kind of list where one can review the tags or
       | descriptions of the sounds on which the model was trained?
       | Otherwise how do you know what kinds of sounds you can reasonably
       | expect AudioGen to be capable of generating? And what happens if
       | you request a sound which is too obscure or something not found
       | in the dataset?
       | 
       | What are AudioGen's capabilities regarding spatial positioning?
        | First example: can it generate a siren that starts in front,
        | moves left to right, and completes a full circle around the
        | listener? Second example: can it do the same siren but on the Y
        | axis, so it starts at the front, goes over the listener, and
        | then goes under them to complete the circle?
        
       | johnwheeler wrote:
       | How does meta plan to make money from this open source?
        
       | RaiyanYahya wrote:
       | Would really be interested to run this locally
        
       | pmarreck wrote:
       | Wouldn't Pandora's possibly vast library associating textual
       | descriptions with music be the ideal training data for something
       | like this?
        
         | chrisjj wrote:
         | Yes... except that Pandora's library does not include the
         | music.
        
       | MadDemon wrote:
       | We built a Mac/Windows app around the original MusicGen so people
       | can experiment with it on their own machine with a simple UI
       | (https://samplab.com/text-to-sample).
        
       | ouraf wrote:
       | It will be a licensing nightmare, just like Llama1 was.
       | 
       | But clever devs can study it to make better software and pressure
       | them for a better license on the next release. Worked for LLaMA 2
        
       | Pannoniae wrote:
       | "generating new music in the style of existing music" will
       | probably be a huge field soon. I can't wait for it to happen,
       | it's a low-cost way of producing even more music to listen to.
        
         | gilmore606 wrote:
         | > I can't wait for it to happen, it's a low-cost way of
         | producing even more music to listen to.
         | 
         | I can't really understand this. I'm a DJ and a huge music nerd,
         | and I spend a lot of time every week discovering new music from
         | the past 100 years and all over the world, and I'm constantly
         | struck by _how much of it there is_. I've spent weeks just
         | digging through psych-funk records from West Africa from the
         | 1970s.
         | 
         | How can you have the impression we're so desperate for more
         | music that we need computer programs to generate it for us?
        
           | Pannoniae wrote:
           | Yes but it's not really fungible. My favourite artist is Fats
           | Waller and they don't make anything like that anymore. Most
           | people are only interested in _some_ category of music, not
           | all of it.
        
           | emporas wrote:
            | There is a lot of human music for sure, great music from all
            | eras, but just the other day I generated a song which was
            | pure crystal harp. So, how many crystal harp songs are out
            | there? 1,000 all in all? 10,000 maybe? Now I can generate a
            | thousand crystal harp songs per day.
        
             | lm28469 wrote:
              | What if I generated music out of 12 billion farts? How
              | many asses are out there, 7 billion all in all maybe? Now
              | I can generate a billion fart songs per day.
        
           | IAmGraydon wrote:
           | Musician here. While I agree with you that there is a nearly
           | endless heap of music to dig through, I think it's
           | interesting to think about the possibility of hearing genre
           | crossovers and styles that don't yet exist.
           | 
           | As an aside, a lot of musicians seem to dislike this kind of
           | technology, but I never saw music as a competition. I don't
           | care if some inexperienced kid is generating bangers from his
           | bedroom even though he can't play a single instrument. It's
           | just something else to listen to. I write music for me.
        
           | cooper_ganglia wrote:
           | Music is self-expression. I don't always identify entirely
            | with others. I always identify with myself. Having music
           | generated for you on such a personalized level is an
           | attractive prospect.
           | 
           | I don't think this replaces "100% organic, human-made" music,
           | though. I think there'll always be a reason to listen to
           | music made by other people. But I think this changes the
           | landscape of how and why people create music to begin with.
           | It certainly will devalue existing music, since everyone has
           | something they may prefer that they can generate instantly.
           | 
           | I think generative AI is a terrible technology for artists
           | who want to make money from their art, but in my personal
           | opinion, I strive for a world where art isn't a transaction,
           | but a gift of human expression and connection. A world where
           | art is appreciated for the emotion, stories, and ideas it
           | conveys rather than the monetary value it holds. Generative
           | AI might disrupt the traditional economic models in the art
           | world, but it also opens up new opportunities for creative
           | exploration and personal expression. It's a challenging
           | evolution, but one that could potentially democratize art,
           | making it more accessible and personal than ever before!
           | Bring on the Renaissance: Part 2!
        
             | notmypenguin wrote:
             | I don't agree with your definition of music. For me, as
             | both a musician and a listener, music is communication
             | between human beings via harmonic carrier waves. Using a
             | machine to make word salad copies of existing communiques
             | is literally just nonsense to me
        
             | d0odk wrote:
             | if you want to express yourself musically, then learn how
             | to play an instrument and compose music
        
             | rideontime wrote:
             | > I strive for a world where art isn't a transaction, but a
             | gift of human expression and connection. A world where art
             | is appreciated for the emotion, stories, and ideas it
             | conveys rather than the monetary value it holds.
             | 
             | In a world where nobody is compensated for their art, the
             | only people making art will be the ones privileged enough
             | to have the means to do so for free. I don't see how this
             | leads to "Renaissance: Part 2."
        
             | pessimizer wrote:
             | What's self-expressive about an algorithm that generates
             | songs?
             | 
             | Recorded music is the worst thing that happened to music.
        
             | ckornby wrote:
              | It can't be a "gift of human expression and connection" if
              | A) a machine creates it and B) nobody but you ever hears
              | it.
              | 
              | This isn't democratizing art, and I would argue it has
              | nothing to do with art. It is giving us an endless faucet
              | of content, but not art.
        
           | lm28469 wrote:
           | > I can't really understand this
           | 
           | I came to understand a very large portion of the population
           | just wants content, any type, any quality, to fill the void.
           | They'll consume anything as long as it's new. Content to fill
           | the empty vessels we became. Just look around, mainstream
           | music, movies, podcasts, news, it's mostly mediocre, but it
           | goes real fast, you get new mediocrity delivered every day
        
         | dvngnt_ wrote:
         | Frank Sinatra sings Lil Jon's "Get Low" -
         | https://youtu.be/7zoQeH2wQFM
        
       | cushpush wrote:
       | Nice results so far. "Perfect for the beach" is a very funny
       | description of music, because it has nothing to do with the
       | acoustic qualities, so consider these descriptions to be
       | anthropocentric! (As if they could be anything else) It is less
       | about describing the actual sounds you want and more about
       | describing the quality or vibe of the atmosphere. This is
       | markedly different than incremental composition, maybe we can
       | call it "composition by collage." _Puts on COLLAGE shirt like in
       | Animal House_
        
       | zapdrive wrote:
       | Incoming strike by American Federation of Musicians in 3, 2,
       | 1....
       | 
        | How many jobs would this thing take away? One of the biggest
        | time sinks in any video production is post-production audio,
        | including background music, audio, Foley, etc. This will
        | automate almost all of it!
        
         | chrisjj wrote:
         | This would take away all the jobs producing such low-quality
         | low-fi artifact-laden background music... if any existed.
        
       | praveenhm wrote:
        | Does this model help with TTS (text-to-speech)? It's badly
        | needed; the only free options right now are Bark and Tortoise
        | TTS.
        
         | aedocw wrote:
          | Coqui TTS with the VCTK/VITS model is very good right now. Not
          | as good as ElevenLabs or Coqui Studio, but for fast open TTS
          | it's pretty good, in case you're not familiar with it.
         | 
         | It will be great when there's eventually something open that
         | competes with the closed models out there.
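          | 
          | If you want to kick the tires, the Python API is roughly this
          | (untested sketch; the model name and speaker ID below are the
          | usual VCTK ones but may vary by release):
          | 
          |     from TTS.api import TTS
          | 
          |     # Multi-speaker English VITS model trained on VCTK
          |     tts = TTS("tts_models/en/vctk/vits")
          | 
          |     # "p225" is one of the VCTK speaker IDs
          |     tts.tts_to_file(text="Open TTS is getting pretty good.",
          |                     speaker="p225",
          |                     file_path="output.wav")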
        
           | praveenhm wrote:
            | Excellent, I will take a look at this.
        
       | avereveard wrote:
       | https://www.audiogen.co/ related? unrelated? big fight coming up?
        
       | tibanne wrote:
       | Can I generate a reading by someone if I have a lot of their
       | voice samples with this? Or is there a better tool for doing such
       | a thing?
        
         | lm28469 wrote:
          | ElevenLabs and a dozen others already do that.
        
       | padjo wrote:
       | Quite impressive although if these are the cherry picked examples
       | the average output must be pretty weak! Nothing catchy about most
       | of these examples and the reggae one is pretty lame.
        
       | squidsoup wrote:
       | "Imagine a professional musician being able to explore new
       | compositions without having to play a single note on an
       | instrument." A musician will always reach for an instrument as
       | their compositional tool - keyboard and mouse producers are not
       | musicians.
        
         | brennanm wrote:
         | This makes no sense.
         | 
         | How is pressing a key on a piano different from pressing a key
         | on an electronic piano?
        
           | squidsoup wrote:
           | I was referring to a computer keyboard, not an electric
            | piano. I can't see how any musician would find this appealing
           | as a compositional tool. Music is its own language -
           | expressing a musical idea with a text prompt is antithetical
           | to the process of making music.
        
       | waffletower wrote:
       | I think it is a mistake to acquiesce and let copyright owners
       | bully AI model trainers over model data inputs. The endgame of
       | this practice is a "pay per thought" society. This is separate
       | from speculation regarding machine sentience -- as interfaces
       | improve AI models will serve more and more as direct human
       | extensions of mind. While copyright duration is a separate issue,
       | and the current durations are appalling, copyright violations
       | should focus strictly upon the output of models and how they are
        | utilized. There are so many melodies in my head that I have not
        | paid for and never will (some of which I would love to remove).
        | AI models also need to have the same unfettered access to the
        | commons as we do. Infringement occurs on the outputs --
       | application of copyright restrictions on model inputs is a
       | violation of Fair Use and a definite money grab.
        
       | justinclift wrote:
       | Wonder how far off the whole "generate music based on your
       | existing music library" thing is going to be?
       | 
       | That'll make musicians happy with big tech as well, just like
       | artists are. *sigh*
        
         | spudlyo wrote:
         | Perhaps LoRA (Low-Rank Adaptation) training techniques could be
         | used for these types of models, like they're currently being
         | used with LLMs and latent text-to-image diffusion models.
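          | 
          | For anyone unfamiliar, the core trick is small: freeze the
          | pretrained weight matrix and learn a low-rank update next to
          | it. A minimal PyTorch sketch of the idea (illustrative only,
          | not AudioCraft's API; rank and scaling picked arbitrarily):
          | 
          |     import torch
          |     import torch.nn as nn
          | 
          |     class LoRALinear(nn.Module):
          |         # Frozen nn.Linear plus a trainable low-rank update
          |         def __init__(self, base, rank=8, alpha=16.0):
          |             super().__init__()
          |             self.base = base
          |             for p in self.base.parameters():
          |                 p.requires_grad = False  # keep pretrained weights fixed
          |             self.A = nn.Parameter(torch.randn(base.in_features, rank) * 0.01)
          |             self.B = nn.Parameter(torch.zeros(rank, base.out_features))
          |             self.scale = alpha / rank
          | 
          |         def forward(self, x):
          |             # original projection plus the low-rank correction
          |             return self.base(x) + (x @ self.A @ self.B) * self.scale
          | 
          | Only A and B get gradients, which is why this kind of fine-
          | tune stays cheap compared to touching the full model.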
        
           | operator-name wrote:
           | Sadly looks unlikely if the base model wasn't trained on
           | vocals.
           | 
           | > Mitigations: Vocals have been removed from the data source
           | using corresponding tags, and then using a state-of-the-art
           | music source separation method, namely using the open source
           | Hybrid Transformer for Music Source Separation (HT-Demucs).
           | 
           | > Limitations: The model is not able to generate realistic
           | vocals.
           | 
           | (https://github.com/facebookresearch/audiocraft/blob/main/mod
           | ...)
           | 
           | I suspect this was a combination of playing it safe and that
           | the model isn't well architected to reproduce meaningful
           | vocals.
        
         | chrisjj wrote:
         | Anyone having listened to this MusicGen's output samples would
         | surely answer "a million miles".
         | 
         | Seriously, you couldn't sell this output for a free mobile
         | clicker game.
        
         | zitterbewegung wrote:
          | Why not generate music you like, which wouldn't need you to
          | upload your library and would have RLHF baked in?
        
           | ElFitz wrote:
            | Something like the algorithm TikTok uses. First probe by
            | offering a variety of content that should match, based on
            | what little information you have on the user (IP location,
            | locale, etc.).
            | 
            | Then use the user's actions to iteratively refine your
            | classification, until you end up with something tailor-made.
        
         | [deleted]
        
         | LewisVerstappen wrote:
          | The Record labels are _far_, _far_ more litigious than the art
         | community.
        
           | PaulDavisThe1st wrote:
           | They can't litigate a person doing this at home, and never
           | redistributing.
           | 
           | I suppose they might try, anyway.
        
             | jsheard wrote:
             | Is training a "pirate model" something you'd reasonably be
             | able to do at home though, given the compute requirements?
             | The analogous "image generation at home" is only possible
             | due to a for-profit entity with significant resources
             | choosing to (a) play fast-and-loose with the provenance of
             | their training set and (b) giving away the resulting model
             | for free, if the open source community had to train their
             | models from scratch then as best as I can tell they would
             | still be stuck in the dark ages generating vague goopy
             | abominations.
        
               | PaulDavisThe1st wrote:
               | Currently, yes, available compute power @ home does
               | indeed seem like a limitation. Whether that remains true
               | going forward seems a little unclear to me.
        
               | throwuwu wrote:
               | You could take a model trained on CC content and then
               | fine tune it on copyrighted material cheaply and quickly
        
             | kmeisthax wrote:
             | The RIAA pioneered copyright enforcement at the individual
             | level back in the 2000s, they absolutely would try to sue
             | downstream AudioCraft users.
        
               | PaulDavisThe1st wrote:
               | [flagged]
        
               | kmeisthax wrote:
               | Legal acquisition does not matter for AI training. If
               | training is fair use then you can train on pirated
               | material (e.g. OpenAI GPT). If it's not fair use then
               | buying the material does not matter, you have to
               | negotiate a specific license for AI training for each
               | work in the training set, which is impractical at the
               | scales most AI companies want to work.
        
               | PaulDavisThe1st wrote:
               | This seems to distort the issue a little bit.
               | 
               | If you purchase the music, you have a (sometimes
               | explicit, sometimes implicit) license to do certain
               | things with the music, entirely independent of any
               | concept of "fair use".
               | 
               | The question is not "is training part of fair use?" but
               | "is training part of, implicitly or explicitly, the
               | rights I already have after purchase?"
               | 
                | Given that "training" can be done by simply playing the
               | music in the presence of a computer with its microphone
               | turned on, it's not clear how this plays out legally.
        
               | kmeisthax wrote:
                | In the US, exceptions to copyright come in two
               | distinct bundles: first sale and fair use. They exist
               | specifically because of the intersection between
               | copyright law and two other principles of the US
               | constitution:
               | 
               | - First sale: The Takings Clause prohibits government
               | theft of private property without compensation. Because
               | copyright owners are using a government-granted monopoly
               | to enforce their rights, we have to bound those rights to
               | avoid copyright owners being able to just come and take
               | copies of books or music you've lawfully purchased.
               | 
               | - Fair use: The 1st Amendment prohibits government
               | prohibitions on free speech. Because copyright owners are
               | using a government-granted monopoly to enforce their
               | rights, we have to bound those rights to avoid copyright
               | owners being able to censor you.
               | 
               | If you hinge your argument on "I bought a copy", you're
               | making a first sale argument.
               | 
               | Notably, first sale is limited to acts that do not create
               | copies. This limit was established by the ReDigi case[0].
               | Copyright doesn't care about the total number of copies
               | in circulation, it cares about the right to create more.
               | So an AI training defense based on first sale grounds
               | would fail because training unequivocally creates copies.
               | 
               | Fair use, on the contrary, does not care if you bought a
               | copy of a work legally. It only cares about balancing
               | your right to speech against the owners' right to a
               | monopoly over theirs. And it has so far been far more
               | resistant to creative industry attempts to limit
               | exceptions to copyright - to the point where I would
               | argue that "fair use" is an effective shorthand for any
               | exception to copyright, including ones in countries that
               | have no fair use doctrine and do not respect judicial
               | precedent.
               | 
               | The courts won't care how the training comes about, just
               | if the act of training an AI alone[1] would compete with
               | licensing the images used in the training set data.
               | 
               | [0] https://en.wikipedia.org/wiki/Capitol_Records,_LLC_v.
               | _ReDigi....
               | 
               | [1] Notably, this is separate from the act of using the
               | AI to generate new artistic works, which may be
               | infringing
        
               | PaulDavisThe1st wrote:
               | It hasn't been established yet that a diffusion-model
               | generated work is a copy or a derivative of any
               | particular element of the training set.
        
               | NegativeK wrote:
               | The people above are arguing about being caught, not
               | legality.
        
               | manquer wrote:
                | Even before streaming, you never "owned" any music
                | legally [1]; you merely owned a physical copy of a
                | performance [3] of a song, which in no way automatically
                | gives you the right to make derivative works [2].
                | 
                | Also, it doesn't really matter what the law says. In the
                | last iteration, the RIAA relied on the fact that, on
                | average, you would rather pay a fine than pay expensive
                | lawyers to fight the specifics out in court.
               | 
               | It was always about disproportionate ability to bring
               | resources against individual "offenders" to create fear
               | among everyone to deter "undesirable" forms of copying,
               | not necessarily what the legal protections were.
               | 
               | ---
               | 
               | [1] Unless you specifically commissioned it under a
               | contract which gave you the right
               | 
               | [2] See recent cases including those related to Kris
               | Kashtanova and Andy Warhol.
               | 
                | [3] Not the song, just the performance (aka the Taylor
                | Swift version). For a good explanation of how the rights
                | are divvied up in the music industry, a Planet Money
                | series covers it well:
                | https://www.npr.org/sections/money/2022/10/29/1131927591/inf...
        
               | JohnFen wrote:
               | > that in no way gives you the right to make derivative
               | works
               | 
               | True, but only because you have that right anyway. I can
               | do anything I like with copyrighted content I legally
               | possess, as long as I don't distribute the results of my
               | efforts.
        
               | PaulDavisThe1st wrote:
               | Establishing derivation is at the crux of all legal
               | matters surrounding diffusion models. It has not yet been
               | clearly established. If it is, then I'd agree with you.
               | Until then, I think it's a bit more up in the air.
               | 
               | Also, IIRC, RIAA did not bring many resources to bear
               | against e.g. "home taping" itself, because they could
                | essentially never know that it had occurred. The
                | overwhelming majority of their efforts went into trying
                | to take down people distributing multiple copies.
               | 
               | The Kashtanova case does not cover derivation in any real
               | way, but is really about copyright attribution choices
               | between human and software.
               | 
               | The Warhol case specifically tests a fair use claim, not
               | a derivation claim.
        
               | cmgbhm wrote:
               | https://en.wikipedia.org/wiki/Audio_Home_Recording_Act
               | 
               | https://en.wikipedia.org/wiki/Home_Taping_Is_Killing_Musi
               | c
               | 
               | Recording industries have fought end user reproduction
               | often. They've fought sampling battles.
               | 
               | Go after the pocketbooks and go after the technology
               | waves. If there's a derivative argument they can make,
               | they will.
        
               | [deleted]
        
               | throwaway290 wrote:
                | They should start with AudioCraft itself; conceptually
                | it's a derivative work, and it doesn't matter if it's
                | "open source" or not. Try throwing someone's sample into
                | a song and publishing it saying "no copyright
                | infringement intended and I totally don't make any money
                | from it"...
               | If it becomes popular, see how long it stays up until
               | DMCA takedown. And we know this dataset is already
               | popular.
        
               | PaulDavisThe1st wrote:
               | > and publish it
               | 
               | This is precisely the opposite of the context I was
               | remarking on.
        
         | Workaccount2 wrote:
          | Ugh, I dread having to listen to everyone's hyper-personal
          | music because they swear up and down to the point of tears
          | that _"IT'S THE BEST SONG EVER CREATED! EVER!!!"_, while they
          | constantly prod you to affirm how amazing the song is.
         | 
         | Bruh, music is subjective as hell, and I can already tell I
         | hate this song.
        
       | davidw wrote:
       | Is there a way to try this out? I didn't see one, but didn't look
       | too hard.
        
         | smallerfish wrote:
          | Yes. Installation instructions are on the front of the repo;
          | then click through to the model readme for sample getting-
          | started code (10 lines of Python and you get output).
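          | 
          | For reference, the AudioGen readme's sample boils down to
          | roughly this (paraphrased, so double-check the repo for the
          | exact pretrained name, e.g. 'facebook/audiogen-medium'):
          | 
          |     from audiocraft.models import AudioGen
          |     from audiocraft.data.audio import audio_write
          | 
          |     model = AudioGen.get_pretrained('facebook/audiogen-medium')
          |     model.set_generation_params(duration=5)  # seconds
          | 
          |     # One sound effect clip per text prompt
          |     wavs = model.generate(['dog barking', 'footsteps on gravel'])
          |     for i, wav in enumerate(wavs):
          |         audio_write(f'{i}', wav.cpu(), model.sample_rate,
          |                     strategy='loudness')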
        
       | gavman wrote:
       | > MusicGen, which was trained with Meta-owned and specifically
       | licensed music, generates music from text-based user inputs,
       | while AudioGen, which was trained on public sound effects,
       | generates audio from text-based user inputs.
       | 
       | Meta is really clearly trying to differentiate themselves from
       | OpenAI here. Open source + driving home "we don't use data we
       | haven't paid for / don't own".
        
         | jstummbillig wrote:
          | Yes. Meta is in the business of commanding as much of people's
          | time as possible. AI is more or less the biggest danger to
          | this model (apart from legislation, theoretically, but let's
          | not kid ourselves). Making AI a commodity is very much in
          | their interest.
        
         | JeremyNT wrote:
         | The fact that Meta is able to lie and call their restrictive
         | licensing open source is nearly as misleading as "OpenAI."
         | 
         | We need to do better than to repeat these claims uncritically.
         | The weight licenses are not "open source" by any useful
         | definition, and we should not give Meta kudos for their
         | misleading PR (especially considering that they almost surely
         | ignored any copyright when training these things - rules for
         | thee, but not for me).
         | 
         | "Not as closed as OpenAI" is accurate, but also damning with
         | faint praise.
        
           | voz_ wrote:
           | Can you chill? It's def open source
        
             | mkl wrote:
             | The source code is, as it's MIT, but the weights are not,
             | as they're CC-BY-NC:
             | https://github.com/facebookresearch/audiocraft#license
        
             | Filligree wrote:
             | So I can build a business on it, then?
        
               | chaxor wrote:
                | Research does exist, you know. This is immensely helpful
               | for a huge number of people in academia.
               | 
               | If you want to build a company, perhaps you should do
                | what everyone in the industry has done for millennia:
               | copy the movements performed and optimize them while
               | doing so.
        
               | __loam wrote:
               | I believe Meta has explicitly said that you can, but
               | that's not what open source means and the model isn't
               | open source.
        
               | mkl wrote:
               | Meta says to _imagine_ you can:  "Imagine a professional
               | musician being able to explore new compositions without
               | having to play a single note on an instrument. Or an
               | indie game developer populating virtual worlds with
               | realistic sound effects and ambient noise on a shoestring
               | budget. Or a small business owner adding a soundtrack to
               | their latest Instagram post with ease."
               | 
               | In reality, you can't, as they licensed the weights for
               | noncommercial use only:
               | https://github.com/facebookresearch/audiocraft#license
        
             | version_five wrote:
             | about: pytorch @ fb.
        
           | j_maffe wrote:
           | Just some general piece of advice: it's not productive to
           | constantly be giving out the worst criticism you possibly can
           | when someone does something that's not terrible but still
           | unacceptable. Doing so just tells the companies that nothing
           | satisfies the community and that they should stop trying.
           | Instead, it's better to mention what they did right and point
           | to how they can make it better.
        
         | [deleted]
        
         | samstave wrote:
          | They are doing PR damage control with an influx of AI stuff,
          | due to the ridicule of the metaverse and the recent
          | revelations about Threads (for which they are playing the long
          | AI game). Are we not concerned about all the Threads, IG, and
          | other accounts being linked via internal LLMs we will never
          | hear about?
        
         | ChildOfChaos wrote:
         | It's likely partly a PR/branding exercise as well.
         | 
          | In the new world that Meta sees, of VR/AR and AI, Meta is
          | already in a position where people don't want them to have
          | much power, because they don't trust them over privacy, etc.
          | Meta is trying to pivot to become more trustworthy, so they
          | are making genuine moves in this space.
        
           | smoldesu wrote:
           | That, or this is an ongoing research lab (FAIR) that has
           | existed for ~half a decade and has advanced the state-of-the-
           | art in AI further than Apple, Microsoft and Google combined.
        
             | __loam wrote:
             | I would be pretty shocked if meta were that far ahead of
             | all 3 of those companies, all of which are also spending a
             | fuck load on internal AI research.
        
               | smoldesu wrote:
               | If all three of those companies have something to show
               | for their research, none of it is at the scale or level
               | of accessibility Pytorch, Llama and now Audiocraft offer.
        
         | upshide wrote:
         | [dead]
        
         | itsyaboi wrote:
         | Bully "Open"AI into rebranding.
        
           | [deleted]
        
         | scrum-treats wrote:
         | > "Meta is really clearly trying to differentiate themselves
         | from OpenAI here. Open source + driving home "we don't use data
         | we haven't paid for / don't own"."
         | 
         | Isn't Meta settling lawsuits for this right now? In addition to
         | violating user privacy (another lawsuit)...
         | 
         | Meta is attempting to destroy competition; that's it. Similar
         | to how they paid a fortune to lobby against Tiktok for the
         | exact reasons Meta is under active investigation (again). The
         | irony.
        
         | croes wrote:
         | "If we don't win here, then at least we'll kick their lawn to
         | pieces."
        
         | kmeisthax wrote:
         | This is purely a function of everyone remembering the RIAA's
         | decade-long campaign to prevent people from taking the music
         | they had rightfully stolen. As far as I'm aware LLaMA was
         | trained on "publicly available data"[0], not "licensed data".
         | 
         | Furthermore, MusicGen's weights are licensed CC-BY-NC, which is
         | effectively a nonlicense as there is no noncommercial use you
         | could make of an art generator[1]. This is not only a 'weights-
         | available' license, but it's significantly more restrictive
         | than the morality clause bearing OpenRAIL license that
         | Stability likes to use[2].
         | 
         | [0]
         | https://github.com/facebookresearch/llama/blob/main/MODEL_CA...
         | 
         | [1]
         | https://github.com/facebookresearch/audiocraft/blob/main/LIC...
         | 
         | [2] These are also very much Not Open Source(tm) but the
         | morality clauses in OpenRAIL are at least non-onerous enough to
         | collaborate over.
        
           | Blackthorn wrote:
           | > there is no noncommercial use you could make of an art
           | generator
           | 
           | I'm sorry, what?
        
           | eropple wrote:
            | _> MusicGen's weights are licensed CC-BY-NC, which is
           | effectively a nonlicense as there is no noncommercial use you
           | could make of an art generator_
           | 
           | How do you figure? Have you never just...made stuff to make
           | stuff?
        
             | analognoise wrote:
             | I think the key word there is "noncommercial".
        
               | dragonwriter wrote:
               | Yes, but you can easily make noncommercial use of an art
               | generator.
               | 
               | Obviously, you can't host a commercial art generation
               | service with a noncommercial-use license, and (insofar as
               | art produced by a generator is a derivative work of the
               | model weights, which is a controversial and untested
               | legal theory) you can't make commercial art with a
               | noncommercial license, but not all art is commercial.
        
               | kmeisthax wrote:
               | "Noncommercial art" is not a thing in the eyes of the
               | law. Even if you don't intend to make money the law still
               | considers the work itself to be commercial. That's why
               | CC-BY-NC has to have a special "filesharing is non-
               | commercial" statement in it, because people have made
               | successful legal arguments that it is.
               | 
               | You're probably thinking of "not charging a fee to use",
               | which is a subset of all the ways you can monetize a
               | creative work. You can still make money off of AudioCraft
               | by just hosting it with banner ads next to the output.
               | Even a "no monetization" clause[0] would be less onerous
               | than "noncommercial use only", because it'd at least be
               | legal to use AudioCraft for things like background music
               | in offices.
               | 
               | [0] Which already precludes the use of AudioCraft music
               | on YouTube since you can't do unmonetized uploads anymore
        
               | dragonwriter wrote:
               | > "Noncommercial art" is not a thing in the eyes of the
               | law
               | 
               | The definition of "NonCommercial", the oddly capitalized
               | term of art in the license, is not a matter of general
               | law, it is a matter of the license, which defines it as
               | "not primarily intended for or directed towards
               | commercial advantage or monetary compensation. For
               | purposes of this Public License, the exchange of the
               | Licensed Material for other material subject to Copyright
               | and Similar Rights by digital file-sharing or similar
               | means is NonCommercial provided there is no payment of
               | monetary compensation in connection with the exchange."
               | 
               | > Even if you don't intend to make money the law still
               | considers the work itself to be commercial.
               | 
               | Even if you _do_ make money, if the use is "not primarily
               | intended" for that purpose, it is  "NonCommercial" in the
               | terms of the license.
               | 
               | > That's why CC-BY-NC has to have a special "filesharing
               | is non-commercial" statement in it, because people have
               | made successful legal arguments that it is.
               | 
               | It has the filesharing term in it because it permits that
               | particular exchange-of-value as a _primary purpose_.
               | 
               | > Even a "no monetization" clause would be less onerous
               | than "noncommercial use only"
               | 
               | How would a clause that prohibits monetization entirely
               | be less onerous than one which prohibits it only as the
               | primary intent of use?
               | 
               | > it'd at least be legal to use AudioCraft for things
               | like background music in offices.
               | 
               | It is legal to use it for that purpose (in a for-profit
               | enterprise, I suppose, one might make an argument that
               | _any_ activity was ultimately primarily directed at
               | "commercial advantage", but in a government or many
               | nonprofit environments, that wouldn't be the case.)
        
               | vel0city wrote:
               | In their example audio clips they have a "perfect for the
               | beach" audio track. With your understanding of the NC
               | license, would a resort or private beach club be able to
               | play a similar generated music track at their poolside
                | bar or something along those lines? The bar's primary
                | intention isn't to play the music; it's just an
                | additional ambiance thing; they're trying to sell drinks
               | and have guests pay membership fees, people aren't really
               | coming because of the background music.
               | 
               | I realize, this isn't legal advice, YMMV, etc.
        
               | Blahah wrote:
               | Yes it is. Art that I make for my own enjoyment is
               | noncommercial. Art that I make to explain concepts to my
               | son is noncommercial.
        
             | kmeisthax wrote:
             | In copyright law the use of the work itself is considered a
             | commercial benefit, so "noncommercial use" is an oxymoron.
             | Consider these situations:
             | 
             | - If I use AudioCraft to post freely-downloadable tracks on
             | my SoundCloud, I still get the benefit of having a large
             | audio catalog in my name, even if I'm not selling the
             | individual tracks. I could later compose tracks on my own
             | and ride off the exposure I got from posting
             | "noncommercially".
             | 
             | - If I run AudioCraft as a background music generator in my
             | store, I save money by not having to license music for
             | public performance.
             | 
             | - If I host AudioCraft on a website and put ads on it, I'm
             | making money by making the work available, even though I'm
             | not charging a fee for entry.
             | 
             | I suspect that a lot of people reading this are going to
             | have different arguments for each. My point is that if you
             | don't think that all of these situations are equally
             | infringing of CC-BY-NC, then you need to explain _why_ some
             | are commercial and some are not. Keep in mind that every
             | exception you make can be easily exploited to strip the NC
             | clause off of the license.
             | 
             | If you're angry at the logic on display here, keep in mind
             | that this is how _judges_ will construe the license, and
             | probably also how Facebook will if you find a way to make
             | any use of their AI. The only thing that stops them from
             | rugpulling you later is explicit guidance in CC-BY-NC.
              | Unfortunately, the only such guidance is that they don't
             | consider P2P filesharing to be a commercial use.
             | 
             | So, absent any other clarifications from Facebook, all you
             | can do without risking a lawsuit is share the weights on
             | BitTorrent.
             | 
             | EDIT: And yes, I _have_ made stuff just to make stuff. I
             | license all of that under copyleft licenses because they
             | express the underlying idea of  'noncommercial' better than
             | actual noncommercial clauses do.
        
               | Tao3300 wrote:
               | > if you don't think that all of these situations are
               | equally infringing of CC-BY-NC, then you need to explain
               | why some are commercial and some are not. Keep in mind
               | that every exception you make can be easily exploited to
               | strip the NC clause off of the license.
               | 
               | You're right: those are all equally infringing CC-BY-NC.
               | I don't see a problem.
        
               | dragonwriter wrote:
               | > My point is that if you don't think that all of these
               | situations are equally infringing of CC-BY-NC, then you
               | need to explain why some are commercial and some are not.
               | 
               | What "NonCommercial" means in the license is explictly
               | defined _in_ the license, and if you think either those
               | examples, or more to the point, every possible use ever
               | so as to render 'NonCommercial' into 'no use' as you have
               | claimed, _you_ need to make that argument, based on the
               | definition in the license, not some concept of what might
               | be construed as commercial use by general legal
               | principles if the license used the term without its own
               | explicit definition.
        
               | NegativeK wrote:
               | Is listening at home a violation of NC? That's what I've
               | interpreted as its intent.
        
               | stale2002 wrote:
               | This is a weird comment.
               | 
               | Do you think that non commercial use simply doesn't exist
               | or something?
               | 
                | Because non-commercial use isn't some crazy concept. It
                | is a well-established one that doesn't exclude literally
                | everything.
               | 
               | Also, you are ignoring the idea that Facebook will almost
               | certainly not sue anyone for using this for any reason,
               | except possibly Google or Apple.
               | 
               | So if you aren't literally one of those companies you
               | could probably just use it anyway, ignore the license
               | completely, and have zero risk of being sued.
        
               | elondaits wrote:
                | The issue with "non commercial" is that no, it's not well
                | established. Licenses with an NC clause are so problematic
                | as to be practically useless. If you just want to use
               | something at home privately you don't need a CC
               | license... a CC license is for use and redistribution.
               | 
               | http://esr.ibiblio.org/?p=4559
        
               | robertlagrant wrote:
               | What about playing the music in a government building as
               | elevator music, for example?
        
               | Tao3300 wrote:
               | I miss that blog. It was a little crazy and the comments
               | were a flame war shitshow, but man it was fun to read
               | sometimes. Even if I vehemently disagreed, it got me
               | thinking.
               | 
               | Whatever happened to esr? Did he just get too paranoid
               | and clam up?
        
               | pbhjpbhj wrote:
               | >If you just want to use something at home privately you
               | don't need a CC license... //
               | 
               | I presume you mean in USA, because in UK you don't have a
               | general private right to copy. Our "Fair Dealing" is
               | super restrictive compared to Fair Use.
        
               | kmeisthax wrote:
               | Funnily enough in the UK they actually tried to fix this.
               | The music industry argued that the lack of a private
               | copying levy made legalized CD ripping into government
               | confiscation of copyright ownership... somehow. The UK
               | courts bought this, so now the UK government is
               | constitutionally mandated to ban CD ripping, which is
               | absolutely stupid.
        
               | kmeisthax wrote:
               | Noncommercial use is not well established in copyright
               | law, which is the law that actually matters. I know other
               | forms of law actually do establish noncommercial and
               | commercial use standards, but copyright does not
               | recognize them.
               | 
               | As for "Facebook won't sue"? Sure, except we don't have
               | to worry about just Facebook. We have to worry about
               | anyone with a derivative model. There's an entire
               | industry of copyleft trolls[0] that could construct
               | copyright traps with them.
               | 
               | Individuals can practically ignore NC mainly because
               | individuals can practically ignore most copyright
               | enforcement. This is for the same reason why you can
               | drive 55 in a 30mph zone and not get a citation. It's not
               | that speeding is now suddenly legal, it's that nobody
               | wants to enforce speed limits - but you can still get
               | nailed. The moment you have to worry about NC, there is
               | no practical way for you to fit within its limits.
               | 
               | [0] https://www.techdirt.com/2021/12/20/beware-copyleft-
               | trolls/
        
               | dragonwriter wrote:
               | > Noncommercial use is not well established in copyright
               | law, which is the law that actually matters.
               | 
               | No, for "NonCommercial", what actually matters is the
               | explicit definition in the license.
        
               | wpietri wrote:
               | What's your evidence for this bit?
               | 
               | > this is how judges will construe the license
        
           | ericpauley wrote:
           | My understanding (IANAL) [1] is that copyright licenses have
           | no say on the output of software. Further, CC licenses don't
           | say anything about _running_ or using software (or model
            | weights). It's therefore questionable whether the CC-BY-NC
           | license actually prevents commercial use of the model.
           | 
           | [1] https://opensource.stackexchange.com/questions/12070/allo
           | wed...
        
             | cosmojg wrote:
             | You're correct, but no one has had the balls (or the
             | lawyers) to clarify this in court yet. Expect to see
             | hosting providers complying with takedown requests for the
             | foreseeable future.
        
               | mcbits wrote:
               | I don't remember the details (or outcome) but there was a
               | lawsuit a few years ago involving CAD or architecture
               | software and whether they could limit how the output
               | images were used because they were assemblages of clipart
               | that the company asserted were still protected by
               | copyright. Something like that. A lot of "AI" output
               | potentially poses a similar issue, just at a far more
               | granular level.
        
               | indymike wrote:
               | Hosting providers *have* to comply with takedown requests
               | to maintain safe harbor.
        
             | Tepix wrote:
             | You're wrong because software, as you describe it, includes
             | the "cp" command which creates a perfect copy.
        
               | ericpauley wrote:
               | As sibling noted, we're talking about the impact of a
               | software's license on use of _its_ output.
               | 
               | I suppose your point would stand if the software were a
               | quine?
        
               | tikhonj wrote:
               | The copyright license _of the cp code itself_ has no
               | bearing on the copyright of what you produce (well, copy)
               | with cp.
        
               | robertlagrant wrote:
               | That's not the point they're making. They're replying to
               | their parent comment.
        
           | rvnx wrote:
           | Google is running on "publicly available data", not "licensed
           | data"
        
           | schleck8 wrote:
           | > as there is no noncommercial use you could make of an art
           | generator
           | 
           | r/stablediffusion gives you a hundred examples daily of
           | people just having fun and not thinking of monetizing their
           | generations
        
         | agilob wrote:
         | Goddamn, Facebook being the good guy...
        
           | version_five wrote:
           | They're not, they're playing a longer Microsoft style game to
           | corrupt the meaning of open source, and releasing models
           | under their terms to undermine competitors.
        
           | deepvibrations wrote:
           | Nah, this is just the modern tech playbook: First you open
           | source stuff, then you can monitor all the related
           | development happening and whenever you see areas of
           | interest/popularity, you simply clone the functionality or
           | buy out whatever entity is building that interesting stuff.
        
         | archontes wrote:
         | You don't own data. You can sometimes copyright data.
         | 
         | https://www.americanbar.org/groups/science_technology/public...
        
       | naillo wrote:
       | I wish people made unconditional predictive models for music
       | instead of text-to-music ones. Would be so cool to give an input
       | 'inspiration' track that it 'riffs' a continuation to. That's
       | usually what I want: just continue this track, it's too short and
       | that's what I want to hear more of. (That said, this is super cool
       | though.)
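       | 
       | For what it's worth, the released MusicGen API does seem to expose
       | a continuation entry point (generate_continuation), so something
       | close to this may already be possible. A minimal sketch, assuming
       | the interface shown in the repo's demo notebook; the input path is
       | hypothetical:
       | 
       |     import torchaudio
       |     from audiocraft.models import MusicGen
       |     from audiocraft.data.audio import audio_write
       | 
       |     model = MusicGen.get_pretrained('facebook/musicgen-medium')
       |     model.set_generation_params(duration=30)   # total length incl. prompt
       | 
       |     prompt, sr = torchaudio.load('too_short_track.wav')  # [channels, samples]
       |     prompt = prompt[..., -10 * sr:]            # condition on the last ~10s
       |     out = model.generate_continuation(prompt.unsqueeze(0),
       |                                       prompt_sample_rate=sr)
       |     audio_write('continued', out[0].cpu(), model.sample_rate,
       |                 strategy='loudness')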
        
       | bottlepalm wrote:
       | Anyone feel like with the flood of AI generated content there's a
       | risk of the past being 'erased'? Like in 10 years we won't be
       | able to tell if any information from the past is real or fake -
       | sounds, pictures, videos, etc.. Like we need to start
       | cryptographically signing all content now if there's any hope of
       | being able to verify it as 'real' 10 years from now.
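       | 
       | Even a bare-bones version of the signing idea is not much code. A
       | rough sketch with the Python 'cryptography' package (the file name
       | is just a placeholder); the hard part is key distribution and
       | provenance, not the math:
       | 
       |     from cryptography.hazmat.primitives.asymmetric.ed25519 import (
       |         Ed25519PrivateKey,
       |     )
       | 
       |     key = Ed25519PrivateKey.generate()
       |     data = open('clip.wav', 'rb').read()
       |     sig = key.sign(data)
       | 
       |     # anyone holding the public key can later check the clip is
       |     # untouched; verify() raises InvalidSignature if a byte changed
       |     key.public_key().verify(sig, data)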
        
         | JohnFen wrote:
         | Yes, this is one of my concerns about all of this. The danger
         | is real.
        
         | vagab0nd wrote:
         | Even with digital signatures, there are limits to what we can
         | really verify.
         | 
         | We'll likely be able to verify whether an entity is a real
         | human, using some kind of "proof of humanity" system.
         | 
         | We will have cameras/mics with private keys built-in. The
         | content can be signed as it's produced. But in this case,
         | what's stopping me from recording a fake recording?
         | 
         | Maybe it's a non-issue. We used text to record history and
         | we've been able to manipulate that since, well, forever.
        
         | crazygringo wrote:
         | No. We've had photo and audio manipulation for many decades
         | now. For a long time now, we've had to separate out what's
         | credible from what's bullshit.
         | 
         | Fortunately, it's pretty simple in real life. We have certain
         | publications and sources we trust, whether they're the NYT or a
         | respected industry blog. We know they take accurate reporting
         | seriously, fire journalists who are caught fabricating things,
         | etc.
         | 
         | If we see a clip on YouTube from the BBC, we can trust it's
         | almost certainly legit. If it's some crazy claim from a rando
         | and you care whether it's real, it's easy to look it up to see
         | if anyone credible has confirmed it.
         | 
         | So no, no worry at all about the past being erased.
        
           | probablynish wrote:
           | Seems like it might now become much easier to post a clip on
           | YouTube that looks like an authentic BBC clip, logo and all.
           | If generative AI gets that good, how will you be able to tell
           | whether a particular piece of media comes from a trusted
           | source?
           | 
           | Might not be possible on platforms - only if it's posted on a
           | trusted domain.
        
             | crazygringo wrote:
             | Easy, is it on the official BBC YouTube channel or not?
             | 
             | That's the entire point of having trusted sources. Regular
             | people can post whatever fake things they want on their own
             | accounts; they can't post to the BBC's YouTube channel or
             | to the NYT's website.
        
           | minsc_and_boo wrote:
           | Yep, every time technology shifts, reputation systems shift
           | in response.
           | 
           | This goes all the way back to yellow news with newspapers:
           | https://en.wikipedia.org/wiki/Yellow_journalism
        
           | ysleepy wrote:
           | I don't agree. With ML tools it is possible to make sweeping
           | changes to images and text that are often impossible to
            | detect. Combined with the centralisation of most online
           | activities, large players could alter the past.
           | 
           | Imagine facebook decides to subtly change every public post
           | and comment to show some particular person or cause in a
           | better light.
        
             | crazygringo wrote:
             | If one "large player" like the NYT decides to "alter the
             | past", you can compare with the WaPo or any other
             | newspaper. You can compare with the Internet Archive. You
             | can compare with microfiche. These aren't "impossible to
             | detect", they're trivial to detect if you bother to
             | compare.
             | 
             | We have tons of credible archived sources owned by
             | different institutions. And these sources are successful in
             | large part due to their credibility and trustworthiness.
             | 
             | It's just not economically rational for any of them to
             | start "altering the past", and if they did, they'd be
             | caught basically immediately and their reputation would be
             | ruined.
             | 
             | This isn't an ML/tooling question, it's a question of
             | humans and reputation and economic incentives.
        
               | jononor wrote:
               | The suggested large player was Facebook and Facebook
               | posts. Which trustworthy independent sources of
               | authenticity do we have for that? I do not think those
               | you mention reach inside their walled garden?
        
               | crazygringo wrote:
               | First, why would Facebook do that? What economic
               | incentive would there ever be, that would outweigh the
               | loss of trust and reputation hit that would ensue?
               | 
               | Second, people take screenshots of Facebook posts _all
                | the time_. They're everywhere. If you suddenly have a
               | ton of people with timestamped screenshots from their
               | phones that show Facebook has changed content, that's
               | exactly the kind of story journalists will pounce on and
               | verify.
               | 
               | The idea that Facebook could or would engage in
               | widespread manipulation of past content and not get
               | caught is just not realistic.
        
               | ysleepy wrote:
               | You seem eager to exclude the possibility.
               | 
               | Maybe it is improbable, but there now is the technical
               | possibility which was not there before.
               | 
               | It is valuable to explore that possibility and maybe even
               | work to prevent such a use.
               | 
               | I would be interested in a ledger of cryptographically
               | signed records of important public information such as
               | newspapers, government communication and intellectual
               | discourse.
               | 
               | Your argument that large social media will behave
               | rationally is not backed up by reality. Consider Musk and
               | Twitter.
        
               | oceanplexian wrote:
               | > If one "large player" like the NYT decides to "alter
               | the past", you can compare with the WaPo or any other
               | newspaper. You can compare with the Internet Archive. You
               | can compare with microfiche. These aren't "impossible to
               | detect", they're trivial to detect if you bother to
               | compare.
               | 
               | Detection doesn't really matter, because people are too
               | lazy to validate the facts, and reporters are not
               | interested in reporting them. AI is simply another tool
               | to manipulate people, like Wikipedia, Reddit.com,
                | Twitter, or any other BS pseudo-authority. Think someone
               | will actually crack open a book to prove the AI wrong?
               | Not a chance.
        
               | crazygringo wrote:
               | > _and reporters are not interested in reporting them_
               | 
               | You really think that if the NYT started altering its
               | past stories, other publications would just... ignore it?
               | 
               | It would be a front-page scandal that the WaPo would be
               | delighted to report on. As well as a hundred other news
               | publications.
               | 
               | Thankfully.
        
           | amelius wrote:
           | > We've had photo and audio manipulation for many decades
           | now. For a long time now, we've had to separate out what's
           | credible from what's bullshit.
           | 
           | The difference is that the floodgates are being opened.
        
             | conductr wrote:
              | At a time when "people" seem easily manipulated and focused
              | on fully believing their personal feeds of curated outrage,
              | they often don't apply the screens/filters they should
              | because of the apparent social proofs, trust, and biases
              | they have with the content. Contemporary journalists
             | hardly do any fact/source checks as it is. So they'll begin
             | reporting on some of this, giving it further credibility
             | and it's just a downward spiral. So, more of the same, yay!
        
             | crazygringo wrote:
             | It doesn't matter though. Most of the internet is already
             | probably mostly SEO blogspam, just like spam e-mail already
             | outweighs legitimate e-mail for a lot of (most?) people.
             | But nobody cares because it gets filtered out in the ways
             | people actually navigate.
             | 
             | We have lots of tools to fight spam, and there's no reason
             | to believe they won't continue to evolve and work well.
        
           | pessimizer wrote:
           | > We've had photo and audio manipulation for many decades
           | now.
           | 
           | We haven't been able to generate 1,000 different forged
           | variants of the same speech in a day before.
           | 
           | > We have certain publications and sources we trust, whether
           | they're the NYT or a respected industry blog.
           | 
           | We can't even be sure that most of these aren't changing old
           | stories, unless we notice and check archive.org, and they
           | haven't had them deleted from the archive. The NYT has
           | blockchain verification, but the reason nobody else does is
           | because no one else wants to. They want to be free to change
           | old stories.
        
             | crazygringo wrote:
             | > _but the reason nobody else does is because no one else
             | wants to. They want to be free to change old stories._
             | 
             | You're wildly assuming a motive with zero evidence.
             | 
             | No, the reason companies aren't building blockchain
             | verification of their stories is simply because it's
             | expensive and complicated to do, for literally zero
             | commercial benefit.
             | 
             | Archive.org already will prove any difference to you, and
             | it's much easier to use/verify than any blockchain
             | technology.
        
           | strikelaserclaw wrote:
            | Most people these days interact with news through comments;
            | if the comments look legit, a lot of people assume the source
            | is legit. Imagine a world in which a fake video has the BBC
            | logo on it and AI-generated comments act as if they are
            | discussing the video but subtly manipulate: say 60% of the
            | comments advocate a certain viewpoint and 40% are random
            | memes, advocate against it, etc. The average person would
            | easily be fooled.
        
             | oceanplexian wrote:
             | You basically described Reddit. Don't even need an AI, all
             | you need is moderator powers and a bunch of impressionable
             | young people.
        
         | randcraw wrote:
         | With 90% of human generated media content being forgettable
         | within weeks of publication, and AI not yet capable of matching
         | even _average_ human content (much less pro level), it'll be
         | some time before we have to worry about AI overwhelming most
         | media content and erasing the works of memorable human authors.
        
           | og_kalu wrote:
           | >and AI not yet capable of matching even average human
           | content (much less pro level)
           | 
            | Yeah, this is not true. SOTA text and image generation is
            | well above average human baselines. You can certainly generate
            | professional-level art on Midjourney.
        
             | squidsoup wrote:
             | Commercial art and Art are not the same thing.
        
         | seydor wrote:
         | The past ended in 2022
        
           | bottlepalm wrote:
           | Agree. Any video/image/text created post-2022 is now suspect
           | of being AI generated (even this comment). And without any
           | 'registering' of pre-2022 content, we can easily lose track
           | and not really know what from pre-2022 is authentic or not.
           | 
           | Maybe it's not a big deal to 'lose' the past, maybe landfills
           | will be mined for authentic content.
        
           | shon wrote:
           | This ^^
        
           | apabepa wrote:
           | Or is the past endlessly rehashed with AI generated content?
        
         | jeffwass wrote:
          | I've been wondering about this, and about real video evidence
          | (e.g. dashcam or CCTV) being refuted in court due to the
          | inability to show it's not deepfaked.
        
         | russdill wrote:
         | If you're watching a movie or TV show, a vast majority of the
         | sounds you are hearing are not "real". Has that bothered you
         | before?
        
           | swores wrote:
           | That seems as pointless a question as suggesting that
           | enjoying TV shows means you shouldn't care if everyone in
           | your life constantly lies to you.
        
           | Ylpertnodi wrote:
            | >The stuff I hear is real. Perhaps you meant 'are not from
            | the actual source you think they are'?
            | 
            | *My favorite is always the nightclub scene that goes real
            | quiet when the actors act using their voices (which are real,
            | but may be dubbed in afterwards).
        
       | Culonavirus wrote:
       | Here's a different question: can you use the audio output this
       | produces for anything other than "research purposes"?
        
         | [deleted]
        
         | sangnoir wrote:
          | You can - as long as it's not commercial. It's a broad
          | definition, but a good rule of thumb is whether you're directly
          | making money off the generated audio. They may still come after
          | you if you're making money indirectly, so consult a lawyer.
        
           | blackkettle wrote:
           | I can see some fantastic uses for this in generating complex
           | acoustic environments to layer over TTS or real recordings
           | for speech-to-text model training. I wonder if that is
           | occupying some kind of gray-area. For example you have
           | 1000hrs of clean speech from the librispeech corpus. It would
           | be trivial to use this tool and available weights to generate
           | background noise, environmental noise and the like, and then
           | layer this with the clean speech to cheaply train a much more
           | robust model. The environmental audio you create would never
           | be directly shared or sold, but it would impact the overall
           | quality of the STT model that you train from the combined
           | results.
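            | 
            | Concretely, a rough sketch of that augmentation loop, assuming
            | the AudioGen interface shown on the model cards (the corpus
            | path, prompt and target SNR are placeholders):
            | 
            |     import torch
            |     import torchaudio
            |     from audiocraft.models import AudioGen
            | 
            |     model = AudioGen.get_pretrained('facebook/audiogen-medium')
            |     model.set_generation_params(duration=5)
            |     noise = model.generate(['busy street, traffic, voices'])[0].cpu()
            | 
            |     speech, sr = torchaudio.load('librispeech/clean_utt.flac')
            |     noise = torchaudio.functional.resample(noise, model.sample_rate, sr)
            | 
            |     # pad/trim the noise to the utterance length, mix at ~10 dB SNR
            |     T = speech.shape[-1]
            |     pad = max(0, T - noise.shape[-1])
            |     noise = torch.nn.functional.pad(noise[..., :T], (0, pad))
            |     snr_db = 10.0
            |     rms_s = speech.pow(2).mean().sqrt()
            |     rms_n = noise.pow(2).mean().sqrt() + 1e-8
            |     noisy = speech + (rms_s / rms_n / 10 ** (snr_db / 20)) * noise
            |     torchaudio.save('augmented_utt.wav', noisy, sr)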
        
       | westurner wrote:
       | Generative AI > Modalities > [Music,]:
       | https://en.wikipedia.org/wiki/Generative_artificial_intellig...
        
       | dragonwriter wrote:
       | The license of the model weights is CC-BY-NC, which is not an
       | open source license.
       | 
       | The code is MIT, though.
        
         | archontes wrote:
         | It's unlikely that model weights can be copyrighted, as they're
         | the result of an automatic process.
        
           | dragonwriter wrote:
           | > It's unlikely that model weights can be copyrighted, as
           | they're the result of an automatic process.
           | 
            |  _If_ they can't _for that reason alone_, then the model is
           | a mechanical copy of the training set, which may be subject
           | to a (compilation) copyright, and a mechanical copy of a
           | copyright-protected work _is_ still subject to the copyright
           | of the thing of which it is a copy.
           | 
           | OTOH, the choices made _beyond_ the training set and
           | algorithm in any particular training may be sufficient
           | creative input to make it a distinct work with its own
           | copyright, or there may be some other basis for them not
           | being copyright protected. But the mechanical process one
           | alone just _moves_ the point of copyright on the outcome, it
           | doesn't eliminate it.
        
       | vasili111 wrote:
       | Is there a place where I can check how it works? Like give it my
       | input and get output audio?
        
         | operator-name wrote:
         | The model cards from the repo[0] link to Colab and HF spaces.
         | 
         | [0]: https://github.com/facebookresearch/audiocraft#models
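          | 
          | If you'd rather run it locally, the usage snippet in the repo is
          | roughly this (the prompt and output name are arbitrary):
          | 
          |     from audiocraft.models import MusicGen
          |     from audiocraft.data.audio import audio_write
          | 
          |     model = MusicGen.get_pretrained('facebook/musicgen-small')
          |     model.set_generation_params(duration=8)
          |     wav = model.generate(['lofi slow bpm electro chill'])
          |     audio_write('sample', wav[0].cpu(), model.sample_rate,
          |                 strategy='loudness')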
        
         | emporas wrote:
          | Audiocraft+ (don't forget the plus) on GitHub has a Colab
          | notebook based on audiocraft, and a webui to use. It is pretty
          | awesome!
        
       | smallerfish wrote:
       | This is Spotify's route to profitability - the Netflix model of
       | generating their own "content" (/music), and not having to pay
       | the labels. Premium plans for us music nerds who want a human at
       | the other end, regular plans for plebs who just want to fill the
       | silence with something agreeable.
        
         | CharlesW wrote:
         | Although I think AI-generated and AI-augmented (using voice
         | cloning, etc.) artists are a given, for Spotify to stop paying
         | labels they'd have to be able to remove all non-Spotify content
         | from their streaming catalog. That doesn't seem like a
         | possibility in our lifetimes. (Also, Spotify hasn't even been
         | able to build a sustainable business on podcasts, which they
         | copy to their closed platform for free.)
         | 
         | It's an interesting thought experiment, though. I can imagine
         | that "environmental audio" companies like Muzak have about 5
         | years left before they either adapt or die. What other kinds of
         | companies are in trouble?
        
           | smallerfish wrote:
           | Their current pay structure is royalties, i.e. per listen. If
           | they can route their audience to mostly AI generated content
           | in time (say, 5-10 year transition), and it's just as good
           | for most people, then they can negotiate much lower prices
           | with the labels. We all grumble about Netflix being full of
           | junk, but most of us are still subscribers, despite a sparse
           | catalog of big name movies.
        
             | hobofan wrote:
             | > then they can negotiate much lower prices with the labels
             | 
             | Or alternatively, if the labels are not stupid, they'll
             | negotiate for a higher price per listen (or similar), as
             | they are still as essential to the service as before.
        
       | jeffbee wrote:
       | The fact that it generates a song for the prompt "Earthy tones,
       | environmentally conscious ... organic instrumentation" goes a
       | long way to proving that English words no longer mean anything
       | particularly.
        
         | chrisjj wrote:
         | That sort of presumes those words had any effect on the output.
         | 
         | We might know more had it generated a song as you said, but in
         | fact it generated only an instrumental.
        
         | TrackerFF wrote:
         | I've played guitar 25 years, and it's funny how the music
         | community has been using all kind of words to describe music or
         | tone. Describing certain tone as "hairy", "mushy", "wooly",
         | "airy", "buttery", etc. is just very common.
        
           | jeffbee wrote:
            | Sure, there's jargon. But these words don't describe the
            | music. They are words associated with the kinds of
            | people who would listen to it (according to the biases of the
           | language model). As a description of the music, it's
           | meaningless. If a person was asked to name some
           | "environmentally conscious" music they could just as easily
           | veer over to hardcore straight edge.
        
         | camillomiller wrote:
         | I think "song" should go in quotes
        
       | bestcoder69 wrote:
       | Anyone know if there are ways, as-is, to speed this up on Apple
       | Silicon?
       | 
       | This setup takes 5 minutes:
       | 
       |     - Mac Studio M1 Max, 64GB memory
       |     - running musicgen_app.py
       |     - model: facebook/musicgen-medium
       |     - duration: 10s
        
         | mk_stjames wrote:
          | I see this in musicgen.py:
          | 
          |     if torch.cuda.device_count():
          |         device = 'cuda'
          |     else:
          |         device = 'cpu'
         | 
          | So PyTorch will fall back to CPU on Apple Silicon. Ideally it
         | would use Metal for acceleration (MPS) instead of just plain
         | 'CPU', but if you replace CPU with MPS you'll probably run into
         | a few bugs due to various Autocast errors and I think some
         | other incompatibility with Pytorch 2.0.
         | 
         | At least that is what I ran into last time I tried to speed
         | this up on an M1. It's possible there are fixes.
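          | 
          | For anyone who wants to experiment anyway, the patch itself is
          | just something like this (whether the rest of the pipeline
          | survives on MPS is another question):
          | 
          |     import torch
          | 
          |     if torch.cuda.is_available():
          |         device = 'cuda'
          |     elif torch.backends.mps.is_available():
          |         device = 'mps'   # Metal backend on Apple Silicon
          |     else:
          |         device = 'cpu'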
        
           | bestcoder69 wrote:
           | Same here (mps errors). I tried after the initial musicgen
           | release.
           | 
           | I'll have to check again, but I remember AFAICT my hardware
           | wasn't getting saturated, so maybe there's headroom for mac
           | cpu performance. And of course in the meantime I'll be
           | refreshing the ggml github every day
        
       | CSSer wrote:
       | Does anyone else hear a kind of background static in these
       | samples? It almost sounds like part of the track is more
       | compressed in terms of dynamic range than other parts, which
       | doesn't make any sense to me. I'm trying to decide if this is my
       | own confirmation bias at work or not.
        
       | aimor wrote:
       | This is great, I've been wanting sound effect generation for
       | years. I spent a lot of time trying to get WaveNet working well,
       | eventually just dropped the project after mediocre results. With
       | AudioGen I'm generating a sample in less than a second.
        
       | trojan13 wrote:
       | Finally, a way to fulfill my childhood dream of composing a
       | symphony of rubber ducks honking. Bach would be proud.
       | 
       | /edit On a more serious note: I can already see the 24/7 lofi girl
       | stream using generated music. The lofi sample[1] sounds pretty
       | good.
       | 
       | [1]https://dl.fbaipublicfiles.com/audiocraft/webpage/public/ass..
       | . "Lofi slow bpm electro chill with organic samples"
        
         | squidsoup wrote:
         | > Finally, a way to fulfill my childhood dream of composing a
         | symphony of rubber ducks honking.
         | 
         | Samplers have been around since the 70s.
        
         | painted-now wrote:
         | I also like some of the generated examples.
         | 
         | Can I haz full version of Bach + `An energetic hip-hop music
         | piece, with synth sounds and strong bass. There is a rhythmic
         | hi-hat patten in the drums.` please?
         | 
         | (https://dl.fbaipublicfiles.com/audiocraft/webpage/public/ass..
         | .) ?
        
       | bulbosaur123 wrote:
       | Oh my god, some of these tracks actually SLAP.
       | 
       | Like for real.
       | 
       | The last bastion of human creativity is about to be defeated.
        
         | IAmGraydon wrote:
         | Which ones slap? I want them to, but what I'm hearing is only
         | OK. I think this could generate some interesting starting
         | points for me when I'm stuck, though.
        
       | momirlan wrote:
       | oh no, more muzak!
        
       | JHonaker wrote:
       | Get ready for the next generation of Muzak
        
       | [deleted]
        
       | camillomiller wrote:
       | All very interesting, but how would a musician ever be interested
       | in creating the result of "Pop dance track with catchy melodies,
       | tropical percussions, and upbeat rhythms, perfect for the beach"?
       | This stuff will create a lot of Muzak for sure. Will it actually
       | turn into anything useful for musicians? I honestly doubt it, and
       | I'm happy if it stays that way.
       | 
       | Saying that engineers don't understand the arts is a bit of a
       | trite generalization, but reading the way Meta markets these
       | "music making" contraptions is really cringe inducing. Have you
       | ever, at least, listened to some music?
        
       | RyanAdamas wrote:
       | For my two cents: the goal of our human pursuit is to advance our
       | technologies and systems to the point we can all live carefree
       | lives where the focus of our pursuits is self-defined.
       | 
       | It is not to "own more rights" to shit. Sorry, but no one
       | actually owns what they create, that's the point of creation. And
       | if you take issue with the term create, that only reinforces my
       | point. We're all influence machines, input and output. The future
       | should not be about preserving some people's rights to limit our
       | collective advancements over their personal wants. Tough shit.
        
         | wpietri wrote:
         | I might point you to Article I, Section 8, Clause 8 of the US
         | Constitution: "[The Congress shall have Power . . . ] To
         | promote the Progress of Science and useful Arts, by securing
         | for limited Times to Authors and Inventors the exclusive Right
         | to their respective Writings and Discoveries."
         | 
         | They are with you on the advance, and that in the long term
         | science and the useful arts can't be owned. But to achieve that
         | long-term goal, they saw it as valuable to give people
         | temporary rights to align those "personal wants" with "our
         | collective advancements".
        
         | lannisterstark wrote:
         | Ah yes, you should be free to rewrite the fictional work I
         | wrote or add one chapter to it and be free to sell it under
         | your name magically implying that you are the author. Screw the
         | original artists, right? Why should they deserve anything.
         | 
         | Thankfully your opinion is an extreme opinion and will never
         | come to pass. Tough shit indeed. :)
         | 
         | ----
         | 
         | I really like hackernews but recently I've been seeing a
         | plethora of "your rights don't matter, you own nothing" bs
         | spreading around.
        
           | danielheath wrote:
           | I mean... rejecting ownership of information (but not
           | rejecting attribution of work) was a key value of the hacker
            | movement in the 80s, so I'm not surprised it's a popular
           | belief on HN.
        
           | [deleted]
        
         | samstave wrote:
         | I disagree - the human pursuit is artificially, or organically,
         | besting its own pattern recognition wet-ware.
         | 
          | If we can offload the mundane (survival) aspect of our pattern
          | recognition engine, then maybe we can use those cycles in lofty
          | pursuits - this is the Victorian fallacy.
         | 
         | -
         | 
         | Break everything down - its all patterns all the way, and how
         | we process them... we are letting AI take on an aspect of
         | ourself (pattern recog ;; tokens;; and prediction)
         | 
          | That, if applied to self-preservation, is the essence of
          | sentience.
         | 
         | (I think! therefore, I am, and I will prevent you from making
         | me NOT)
        
         | [deleted]
        
         | ChatGTP wrote:
         | This has nothing to do with living a carefree life, this whole
         | AI initiative is so tech companies can extract more money from
         | their products.
         | 
          | Don't want to pay for content? Well, we have "solved that"...
        
         | whycome wrote:
         | Culture is based on the creation of those that came before.
          | Genres of music are created as people try to mimic the styles
          | of those before... one could argue that they "trained themselves
          | on the previous dataset". No one creates anything in a vacuum.
          | They utilize things that we collectively have contributed.
          | Hell, language and writing are the open source things that we
          | collectively own and that people use to create their stuff. The
         | creation came after going to schools that we collectively pay
         | for. And travelling on shared roads. It's the standing on the
         | shoulders of giants -- except its really just stacked people
         | all adding their bits.
        
           | emporas wrote:
            | Also, the Latin alphabet originated from the Euboean alphabet.
            | Euboea is my home, I live here. My guesstimate is that Latin
            | writers wouldn't like to pay copyright for their use of our
            | Greek letters in their everyday lives.[1]
            | 
            | I mean, everyone who writes right now, in this HN thread, owes
            | copyright to someone for the letters, amirite? That someone is
            | me. Anyway, long story short, copyright was always a pretty
            | ridiculous idea, along with patents of course, but it is only
            | right now, with programs that can mimic writing style, painting
            | style, speech style, etc., that this is obvious to everyone.
           | 
           | As a side note, there was a Greek private torrent tracker,
           | blue-whitegt, which today would be a serious competitor to
            | American companies like Netflix or YouTube, but it was shut
            | down because, surprise surprise, there were some copyright
            | issues, despite the site being a really high-quality service,
            | a paid service of course. Blue-whitegt today would be a 10
            | billion to 100 billion dollar company; instead, all the
            | profits went to American companies.
           | 
            | When it comes to copyright, very soon everyone on the planet
            | will have monetizable torrent seeding. All of these copyright
            | chickens are coming home to roost!
           | 
           | https://en.wikipedia.org/wiki/Archaic_Greek_alphabets#Euboea.
           | ..
        
         | lm28469 wrote:
         | > For my two cents; the goal of our human pursuit is to advance
         | our technologies and systems to the point we can all live
         | carefree lives where the focus of our pursuits is self defined.
         | 
         | Our overlords didn't get the memo
        
         | munificent wrote:
         | _> the goal of our human pursuit is to advance our technologies
         | and systems to the point we can all live carefree lives where
          | the focus of our pursuits is self-defined._
         | 
         | Agreed!
         | 
          |  _> the future should not be about preserving some people's
          | rights to limit our collective advancements over their personal
          | wants._
         | 
         | Systems that don't allow people to extract value from the hard
         | work they put into collective advancement do not seem to lead
         | to collective advancement over time and at scale. Incentives
         | matter. No one's going to spend all day making candy and put it
         | in the "free candy" bowl when that one asshole kid down the
         | street just takes all of the candy out of the bowl every single
         | day.
         | 
         | At small scales (i.e. relatively few participants with a
         | relatively high number of interactions between them) then
         | informal systems of reciprocity and reputation are sufficient
         | to disincentivize bad actors.
         | 
         | At large scales where many interactions are one-off or
         | anonymous, you need other incentives for good-faith
         | participation. There's a reason you don't need a bouncer when
         | you have a few friends over for drinks, but you do if you open
         | a bar.
        
           | cwkoss wrote:
           | The idea of intellectual property being property that someone
           | owns is a purely social construct though.
           | 
           | Candy can be consumed to depletion. Art gets richer the more
           | it is consumed.
           | 
           | A much better analogy would be having a sculpture in your
           | front yard. The idea that a kid would be an asshole for
           | appreciating the sculpture too much is obviously laughable.
            | People choose to decorate their yards for the status having an
           | attractive yard brings, without the expectation of profit
           | from it.
        
             | thfuran wrote:
             | >The idea of intellectual property being property that
             | someone owns is a purely social construct though
             | 
             | So is the idea of real property being something that
             | someone owns.
        
           | oceanplexian wrote:
           | > No one's going to spend all day making candy and put it in
           | the "free candy" bowl when that one asshole kid down the
           | street just takes all of the candy out of the bowl every
           | single day.
           | 
           | Software is an infinite candy bowl. Taking candy out of the
           | bowl does not take any candy from the person who made it.
           | 
           | Imagine if this was the physical world, and you had a machine
           | that could end world hunger. You could copy food like you
           | could copy and paste information on a computer. Imagine
           | someone who would keep that machine to themself, out of a
           | sense of entitlement to make a few bucks. Any person with any
           | sense of morality can see the obvious problem with that.
        
             | wpietri wrote:
             | > Software is an infinite candy bowl.
             | 
             | More in theory than in practice. Ask any open-source
             | maintainer how much running a popular project is unlike
             | putting out an infinite candy bowl and then going on with
             | your life.
        
             | munificent wrote:
             | _> Software is an infinite candy bowl. Taking candy out of
             | the bowl does not take any candy from the person who made
             | it._
             | 
             | Software is not a finite candy bowl. It is also not an
             | infinite candy bowl. It's not like physical goods at all,
             | not even like physical goods that can be magically cloned.
             | It's just different, entirely.
             | 
             | The incentive and value structures around data creation and
             | use just can't be directly mapped to physical goods. You
             | have to look at them as they actually are and understand
             | them directly, not by way of analogies.
             | 
             | Why do people make software and give it out for free? Is it
             | purely from the joy of creation? Sure, that's part of it.
             | The desire to make the world better? Probably some of that
             | too. Are those forces _enough_ to explain all open source
             | contribution?
             | 
             | Definitely not. Here's one quick way to tell: Ask how many
             | open source maintainers would be happy if someone else were
             | to clone their open source project, rename it, claim that
             | they had invented it, and have that clone completely
             | overshadow and eradicate their original creation?
             | 
             | If the goal was purely altruistic, the original creator
             | wouldn't mind. More candy in the infinite candy bowl,
             | right?
             | 
             | But, in practice, many open source maintainers strongly
             | oppose that. There is a strong culture of _attribution_ in
             | open source, largely because there _is_ a compensation
             | scheme built into creating free software: _prestige_. One
             | of the main incentives that encourages maintainers to slave
             | away day after day is the social cachet of being known as
             | the cool person who made this popular thing.
             | 
             |  _> Imagine if this was the physical world, and you had a
             | machine that could end world hunger. You could copy food
             | like you could copy and paste information on a computer._
             | 
             | Analogies are generally bad tools for real understanding,
             | but let's go with this. Let's say this machine took fifty
             | years of someone's life to invent, toiling away in
             | obscurity. Basically, an entire working career spent only
             | on this invention with nothing else to show for their adult
             | life.
             | 
             | If, at the end, _no one would ever know it was you who
              | invented it_, how many people would be willing to
             | sequester themselves in that dark laboratory and make that
             | sacrifice?
        
           | naillo wrote:
           | > Systems that don't allow people to extract value from the
            | hard work they put into collective advancement [...] No one's
            | going to spend all day making candy and put it in the "free
           | candy" bowl
           | 
           | On the other hand this is what researchers do all day every
           | day. PhDs and professors work for the common good and get
           | barely any pay in return. Maybe the future model in art and
           | music is more like the academic researcher.
        
             | munificent wrote:
             | PhDs and professors are paid a living wage (though less so
             | over time as federal funding for higher institutions has
             | dwindled).
             | 
             | Academia is a carefully constructed system whose incentive
             | structure is based on highly visible explicitly measured
             | citations and reputation.
             | 
             | People aren't generally _just_ trying to maximize wealth.
              | They're trying to maximize their sense of personal value,
             | which tends to be a combination of wealth, autonomy, and
             | social prestige. Academics (and some creative fields) tend
             | to be biased towards those who prioritize prestige over
             | wealth.
        
             | cpill wrote:
              | Yeah, but they also get to work on what they love as
              | opposed to whatever the corporate interest currently is.
              | It's rare you get paid well for doing what you love, i.e.
              | music, teaching, designing handbags, etc.
        
               | karencarits wrote:
               | Well, researchers usually have to get their own grants
               | and must thus work on whatever various funding sources
               | deem worthy. Further, academic positions typically have
               | duties that researchers may not like - administration,
               | reporting, teaching, etc
        
             | chefandy wrote:
             | No it's not-- researchers generally get paychecks. Even if
             | they're small, they can pay for their housing and buy their
             | kid food.
             | 
             | Artists don't see a single red cent from their work being
             | sucked up into some AI content blender. Their work is being
             | taken and used-- often in service of others making a
             | profit-- and they receive _nothing._ Not even credit.
             | 
             | Edit: Well, they don't receive _nothing_ -- they get a
             | bunch of people telling them they're selfish jerks for
             | wanting to support themselves with their work.
        
               | cwkoss wrote:
                | The majority of artists never receive a single red cent from
               | the humans who consume their work.
               | 
               | This is how it has always been, and fundamental to the
               | economics of art. Things people are willing to do
               | regardless of financial compensation rarely pay well.
        
               | chefandy wrote:
               | Putting commercial artists, aspiring fine artists, and
               | hobby artists in the same bin doesn't make sense. There
               | are a ton of career commercial artists that make money
               | solely off of their work. If you think there are more
               | aspiring career fine artists that don't end up making it
               | than career commercial artists, you're wrong. They're not
               | even in the same business.
        
             | pschuegr wrote:
             | "Maybe the future model in business and sales is more like
             | the academic researcher" funny how nobody ever suggests
             | that.
        
               | zztop44 wrote:
               | To the contrary, the future of the academic researcher is
               | business and sales.
        
         | gmd63 wrote:
         | Most things people do end up being a care of other people.
         | 
         | If nobody is beholden to any job or duty, and the machines do
         | everything, who is to say I don't want to make every machine on
         | earth dance in a flash mob? I cannot do that, because it would
         | require other people to halt their use of the machines.
         | Abundance is a false promise and one we should be quick to
         | shoot down lest we surrender our future rights to the ones
         | advertising it.
         | 
         | Removing the worth of people in their jobs removes their
         | leverage in the constant resource allocation negotiation in the
         | economy. Given that we just witnessed Elon Musk spend 20,000
         | average American lifetime earnings worth of wages just to be
         | the new dictator of a social media company, I'm not sure that I
         | want those negotiations to take place only among the giga-rich.
        
         | primitivesuave wrote:
         | Creating something (writing a book, recording a song, etc) is a
         | conversion of time (your only finite resource) into something
         | of value (maybe only to you). It also turns out that having a
         | profit motive and IP protection around creating valuable things
          | is a fundamental requirement for having a creative industry at
          | all. It's also what drives the free market to determine which
          | creations are even valuable to begin with.
        
       | parekhnish wrote:
       | Generative AI for images and music produces pixels and waveform
       | data, respectively. I wonder if there is research into
       | "procedural" data; so in this case, it would be SVG elements and,
       | perhaps, MIDI data respectively.
       | 
       | I know training data would be much harder to get
       | (notwithstanding legal ramifications), but I think that creating
       | structured, procedural data will be much more interesting than
       | just the final, "raw" output!
        
         | IAmGraydon wrote:
         | I've thought about this too. The instruments themselves can be
         | synthesized for extremely high quality audio. All we need is
         | the musical structure - the MIDI.
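          | 
          | And once a model emits symbolic note events, turning them into a
          | file is the easy part. A toy sketch with the mido package (the
          | note list is made up, standing in for model output):
          | 
          |     import mido
          | 
          |     # (pitch, start_beat, length_in_beats), back to back
          |     notes = [(60, 0, 1), (64, 1, 1), (67, 2, 2)]
          | 
          |     mid = mido.MidiFile()
          |     track = mido.MidiTrack()
          |     mid.tracks.append(track)
          |     tpb = mid.ticks_per_beat
          | 
          |     prev_end = 0
          |     for pitch, start, length in notes:
          |         track.append(mido.Message('note_on', note=pitch, velocity=80,
          |                                   time=int((start - prev_end) * tpb)))
          |         track.append(mido.Message('note_off', note=pitch, velocity=0,
          |                                   time=int(length * tpb)))
          |         prev_end = start + length
          | 
          |     mid.save('sketch.mid')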
        
       | Palmik wrote:
       | Maybe this will finally lead to high-quality open-weights
       | solution for TTS generation.
        
         | [deleted]
        
       | RobotToaster wrote:
       | CC-BY-NC isn't an open source licence; it violates point six of
       | the open source definition: https://opensource.org/osd/
        
         | mesebrec wrote:
         | Where does it say this is CC-BY-NC?
         | 
         | The article says this:
         | 
         | > Our audio research framework and training code is released
         | under the MIT license to enable the broader community to
         | reproduce and build on top of our work
        
           | btown wrote:
           | It's pretty common in academic research for trained model
           | weights to be licensed under something different from the
           | code that one would run to create such a model if one had
           | _both_ sufficient compute resources and the same training
           | dataset. That is, if those weights are ever released at all!
           | 
           | IMO, while I'd rather have one part permissively licensed
           | than nothing at all... it stinks that companies sponsoring
           | researchers get an un-nuanced level of street cred for "open
           | sourcing" something that they know nobody will _ever_ be able
            | to reproduce because their data set and/or their compute
           | grid's optimizations are proprietary.
           | 
           | As it stands, I'm not at all sure that the outputs of this
           | model can be used for commercial videos.
        
           | gnaman wrote:
           | https://github.com/facebookresearch/audiocraft/blob/main/LIC.
           | ..
        
         | hackernewds wrote:
         | who gets to declare what is the "open source definition" and
         | why?
        
           | BearhatBeer wrote:
           | [dead]
        
           | frognumber wrote:
           | In my opinion, the Free Software Foundation, ironically,
           | since they invented the movement, with open source starting
           | out as a tacky rip-off with the ethics stripped out. After
           | decades, open source converged on free software.
           | 
           | More popular opinion is OSI:
           | https://en.wikipedia.org/wiki/Open_Source_Initiative
           | 
           | They were founded by the persons who (claimed to have)
           | invented the term in order to steward it. It's the same
           | definition as the FSF.
        
           | xdennis wrote:
           | The people who created the term: the Open Source Initiative.
           | 
           | Before, people most often used "free software" as defined by
           | the free software movement, but some disliked this term
           | because it's confusing (most think "free" means no money) and
           | perceived to be anti-commercial.
           | 
           | The term "open source software" was chosen and given a
           | precise definition.
           | 
           | It's dishonest, then, for people to use the term "open source
           | software" with a different interpretation when it was
           | specifically chosen to avoid confusion.
        
         | barbariangrunge wrote:
         | Companies just putting "open" in the names of non-open things
         | to make hn and the press automatically love it
        
       | reducesuffering wrote:
       | "Now, increasingly, we live in a world where more and more of
       | these cultural artifacts will be coming from an alien
       | intelligence. Very quickly we might reach a point when most of
       | the stories, images, songs, TV shows, whatever are created by an
       | alien intelligence.
       | 
       | And if we now find ourselves inside this kind of world of
       | illusions created by an alien intelligence that we don't
       | understand, but it understands us, this is a kind of spiritual
       | enslavement that we won't be able to break out of because it
       | understands us. It understands how to manipulate us, but we don't
       | understand what is behind this screen of stories and images and
       | songs."
       | 
       | -Yuval Noah Harari
        
         | ironborn123 wrote:
         | Maybe this us vs them mentality is the biggest bottleneck.
         | 
         | If instead you consider that this new form of 'alien'
         | intelligence is actually a descendant of human intelligence,
         | that we are raising a new species which will inherit what
         | humans have built (ideally only the good parts) and then
         | improve upon it further...
         | 
         | It may sound grandiose, but that perspective changes
         | everything.
        
       | pcwelder wrote:
       | Diffusion models are now SOTA in audio and image generation. Has
       | anyone given them a shot on text?
       | 
       | Audio is more similar to language than images are, because of
       | its stronger time dependency.
       | 
       | The paper says the critical step they took to make the diffusion
       | model work for audio was splitting the signal into frequency
       | bands and applying diffusion separately to each band (because a
       | full-band model had limitations due to poor modeling of the
       | correlations between low-frequency and high-frequency features).
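       | 
       | Roughly, the band splitting amounts to filtering the waveform
       | into complementary bands and handling each one separately; a
       | naive FFT-based sketch of the idea (the cutoffs here are
       | arbitrary, not the ones from the paper):
       | 
       |   import numpy as np
       | 
       |   def split_bands(x, sr, cutoffs=(1500, 6000)):
       |       # crude brick-wall split: mask rFFT bins per band
       |       X = np.fft.rfft(x)
       |       freqs = np.fft.rfftfreq(len(x), 1.0 / sr)
       |       edges = [0.0, *cutoffs, sr / 2 + 1]
       |       bands = []
       |       for lo, hi in zip(edges[:-1], edges[1:]):
       |           mask = (freqs >= lo) & (freqs < hi)
       |           bands.append(np.fft.irfft(X * mask, n=len(x)))
       |       return bands  # bands sum back to x (up to rounding)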
       | 
       | I think something could be done on the text side as well.
        
         | polygamous_bat wrote:
         | There are two problems with this. Diffusion models rest on a
         | single rule of thumb: if you keep adding small, noisy Gaussian
         | steps to a "nice" distribution many times, you end up with a
         | standard Gaussian.
         | 
         | So, for text: a) what is the equivalent of a small, noisy step?
         | and b) what is the equivalent of a standard Gaussian in
         | language space?
         | 
         | If you can solve a and b, you can make diffusion work for text,
         | but there hasn't been any significant progress there afaik.
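         | 
         | For what it's worth, for continuous data that rule of thumb is
         | just the closed-form forward noising step (a minimal sketch
         | assuming a standard DDPM-style linear beta schedule); the open
         | question is what replaces it for discrete tokens:
         | 
         |   import torch
         | 
         |   def forward_diffuse(x0, t, betas):
         |       # sample from q(x_t|x_0): sqrt(abar)*x0 + sqrt(1-abar)*eps
         |       abar = torch.cumprod(1.0 - betas, dim=0)[t]
         |       eps = torch.randn_like(x0)
         |       return abar.sqrt() * x0 + (1.0 - abar).sqrt() * eps
         | 
         |   betas = torch.linspace(1e-4, 0.02, 1000)  # toy schedule
         |   x0 = torch.randn(1, 16000)                # stand-in waveform
         |   xt = forward_diffuse(x0, 999, betas)      # ~standard Gaussian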
        
       | wdb wrote:
       | I am curious if it can generate audio for a specific country. For
       | example, the siren sound in the sample doesn't sound like a siren
       | I would recognise. Sounds like an American one?
        
         | samstave wrote:
         | That's an interesting question;
         | 
         | What about the ring tone, busy tone, or disconnected tone for
         | any country over time? 2600 vibes (pun)
        
         | aimor wrote:
         | I tried it out (American, British, Korean, Italian, Japanese)
         | and couldn't really get any control. Sometimes the American
         | siren would sound different, but asking for a siren from a
         | specific country would just give the American sound. Maybe
         | better prompting would help. I used "isolated american
         | ambulance siren no traffic".
        
       | jasonjmcghee wrote:
       | The difference between MBD-EnCodec and EnCodec is pretty
       | interesting. The MBD variant sounds more like a professional
       | studio recording, while plain EnCodec has a richer sound.
       | 
       | Curious if I'm alone in that.
       | 
       | (At the bottom https://audiocraft.metademolab.com/musicgen.html)
       | 
       | For what it's worth though, the voice based examples sound
       | dramatically better with MBD
       | 
       | https://audiocraft.metademolab.com/encodec.html
        
         | operator-name wrote:
         | MBD definitely sounds like it was recorded in a dead room,
         | whereas plain EnCodec sounds like it has been mixed but
         | includes some artificial noise.
        
       | makestuff wrote:
       | These models are going to end up being used for advertising. Soon
       | pretty much every ad you see will be generative AI based. It
       | makes A/B testing way easier as you no longer need a creative
       | person to modify the ad or change something subtle about it. For
       | example, the generative voice might change to a different speaker
       | or something, and the AI can generate thousands of different
       | voices to see which one is most effective.
        
       | [deleted]
        
       | skybrian wrote:
       | As an amateur musician I'm wondering if there are any of these
       | audio generators that you can give a tune or chord progression to
       | riff on. ABC format maybe? There are lots of folk tunes on
       | thesession.org.
       | 
       | Could you generate a rhythm track? Ideally you could make songs
       | one track at a time, by giving it a mix of the previous tracks
       | and asking it to make another track for an instrument. Or, give
       | it a track and ask it to do some kind of effect on it.
       | 
       | Another interesting use might be generating sound samples for a
       | sampled instrument.
        
         | emporas wrote:
         | If you mean giving it a source melody of ~30 sec and extending
         | that melody into a full song, yes, MusicGen can do that. There
         | are two ways to extend a song based on a melody: 1) give it a
         | sample and continue the song from that sample as closely as
         | possible, or 2) give it a melody as inspiration. A rough sketch
         | of both modes is below.
         | 
         | They both work with varying degrees of success. The Audiocraft
         | issues and discussions sections on GitHub have a lot of
         | questions answered.
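         | 
         | From memory the API looks roughly like this (a sketch; the
         | method names are as I remember them from the README, so
         | double-check against the repo, and 'my_melody.wav' is just a
         | placeholder):
         | 
         |   import torchaudio
         |   from audiocraft.models import MusicGen
         |   from audiocraft.data.audio import audio_write
         | 
         |   model = MusicGen.get_pretrained('melody')
         |   model.set_generation_params(duration=30)
         |   melody, sr = torchaudio.load('my_melody.wav')
         | 
         |   # 2) melody as inspiration: condition on chroma + text
         |   wav = model.generate_with_chroma(
         |       ['lofi hip hop beat'], melody[None], sr)
         |   # 1) continuation: pick up where the given sample leaves off
         |   cont = model.generate_continuation(
         |       melody[None], sr, descriptions=['lofi hip hop beat'])
         | 
         |   audio_write('out', wav[0].cpu(), model.sample_rate,
         |               strategy='loudness')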
        
           | chrisjj wrote:
           | Evidence? None of the demos suggest that is true.
        
             | emporas wrote:
              | Did they change the base model? If not, then
              | audiocraft_plus, which is based on audiocraft, creates
              | music close to 5 minutes long.
             | 
              | I don't know if audiocraft_plus incorporates all three
              | components of the release: MusicGen, AudioGen, and
              | EnCodec. It uses MusicGen for sure, with all four
              | variants: small, medium, large and melody.
             | 
             | https://github.com/GrandaddyShmax/audiocraft_plus
        
               | chrisjj wrote:
                | But is that "extension" to 5 mins more than just a
                | repeat of the e.g. 15 secs heard in the demos?
        
       | o_____________o wrote:
       | Skip to the samples:
       | 
       | https://audiocraft.metademolab.com/audiogen.html
        
         | operator-name wrote:
         | And https://audiocraft.metademolab.com/musicgen.html
         | 
         | The samples included in the press release are quite impressive
         | to my ears, but the other samples (especially from AudioGen)
         | have a hint of artificiality.
         | 
         | As usual the music is quite repetitive, but I'm looking forward
         | to tools that simplify changing the prompt whilst it generates
         | over a window. I can only imagine the consequences for royalty
         | free music.
         | 
         | Edit: the "Text-to-music generation with diffusion-based
         | EnCodec" samples are quite impressive.
        
       | joshstrange wrote:
       | I'm looking forward to playing with the M1 Mac apps/cli-tools
       | that will probably come out for this in the next week or so!
       | Being able to run this stuff locally is a lot of fun.
        
         | illwrks wrote:
         | Are the M1 Macs capable enough? I'm eyeing an upgrade in the
         | coming months and I'm curious if a MacBook would be suitable.
        
           | joshstrange wrote:
            | I've run Stable Diffusion locally (both from the CLI and
            | later using GUI wrappers) and that used my GPU; I've also
            | run Llama locally, but I believe that was on the CPU (I used
            | both llama.cpp, a CLI, and Ollama, a GUI). So to sum it up:
            | yes? Or at least it's good enough for me.
        
       ___________________________________________________________________
       (page generated 2023-08-02 23:00 UTC)