[HN Gopher] Open-sourcing AudioCraft: Generative AI for audio
___________________________________________________________________
Open-sourcing AudioCraft: Generative AI for audio
Author : iyaja
Score : 738 points
Date : 2023-08-02 15:36 UTC (7 hours ago)
(HTM) web link (ai.meta.com)
(TXT) w3m dump (ai.meta.com)
| peteforde wrote:
| I just ran all of the cited installation steps, which appear to
| have been successful... but I am now experiencing a profound
| sense of "now what?"
|
| There don't appear to be any new CLI executables installed, and
| the documentation links to an API, but there are no clues on
| how to actually process a prompt.
|
| What am I missing? Alternatively, I wouldn't mind using it in a
| Notebook but so far this thread doesn't link to anything so
| ambitious (yet?)
| [deleted]
| javajosh wrote:
| You're not supposed to actually install it and use it, just
| comment on how cool and open Facebook is, especially in
| comparison to OpenAI. So, user error.
| parhamn wrote:
| Right, it's not like anyone has operationalized Llama 2, or
| there aren't hundreds of repos for inference servers and the
| like. /s
| speedgoose wrote:
| The main gradio app has been moved to the demos folder.
| python demos/musicgen_app.py
|
| Otherwise you can check the jupyter notebooks in the same
| folder.
| peteforde wrote:
| Thanks! This would be even more helpful if you could share a
| hint about where this was installed to.
|
| I carefully went through the output generated by the "pip
| install -U audiocraft" command, and there were no clues
| provided.
|
| Disclosure: I am not a Python developer, so I apologize if
| this is a master-of-the-obvious question for Python folks.
| However, if there was ever a scenario where a line or two of
| post-install notes would be useful, it's stuff like this.
| speedgoose wrote:
| You may have to clone the repository to get the demos
| folder. Otherwise it's somewhere that depends on how you use
| Python (a global and often broken environment, virtual
| environments, conda hell, etc.).
|
| I feel like Python folks are, on average, terrible at
| distributing software. So many projects have some Python
| script to install the dependencies, still assume you use
| conda, or don't bother to pin dependency versions.
| Thankfully it's often the same patterns, and after some time
| you understand what to do based on the error messages. But I
| wish they would use something like npm or Cargo. Even
| something like Maven would be an improvement.
| moffkalast wrote:
| This is the default state of deep learning projects: everyone
| assumes the only people who will ever try them are PhD
| researchers who already know every tool in the chain. What's
| happened with llama and other LLMs, with codebases that
| actually work with one click once compiled, is a pretty big
| outlier.
| wg0 wrote:
| "What a time to be alive!"
| s1k3s wrote:
| The demos are great. Could someone explain what's in it for Meta
| open sourcing all these models?
| jamil7 wrote:
| Not a fan of Meta, but haven't they generally been pretty
| forthcoming about open-sourcing their tech?
| bick_nyers wrote:
| If anything it makes them appear to be one of the best places
| to work at to do research. Could be them playing the long game.
| Saturdays wrote:
| Theoretically, what's in it for them is that people will
| build content faster and with fewer barriers, for eventual
| consumption on their platforms.
| ipaddr wrote:
| They haven't open-sourced much. Open models with weights
| under a restrictive non-commercial license is something, I
| guess.
|
| They are trying to kill the market before they get left out.
| jdadj wrote:
| Commoditize Your Complement?
|
| https://gwern.net/complement
| maximus-decimus wrote:
| What is it a complement to though?
| jwestbury wrote:
| Content is a complement to social media.
| CrypticByte87 wrote:
| Meta has several of the biggest UGC platforms, and in this
| case the complement is content itself. Reels with
| autogenerated (and royalty free) background music is the
| obvious example but I'm sure there are more. Maybe creative
| for ads as well?
| jononor wrote:
| To Metaverse access. Filling the metaverse with engaging
| interactive 3D content is an insane job with 2020
| technology. It requires a huge range of skilled labor to
| create 3D models, soundtracks, NPC dialog, visuals, etc. to
| make a compelling experience. By 2030 that may have been
| reduced to the point that anyone with creativity and
| Internet access can do it. Sure, most of it will be silly
| things - but so is social media today, and that doesn't
| make it any less of a commercial success. And there will be
| millions of semi-pro creators making the things with higher
| production value, as with videography today.
| raincole wrote:
| In the short term it's social media, because people will
| share whatever they generate on social media. But I don't
| think that's a very strong incentive for Meta to invest in
| AI.
| gostsamo wrote:
| They want to commoditize the offerings of OpenAI, Google,
| MS, and Apple. They also gain mindshare and goodwill after
| years of bad publicity. And some contributions back might
| help them improve the models for free.
|
| If they just keep their models, people won't be interested and
| will build over ChatGPT or Bard.
| kypro wrote:
| A competitive opensource project basically destroys the pricing
| power of all closed-source alternatives.
|
| If you're a company that wanted to integrate an LLM into
| your product and the choice is between several equally good
| models, one of which is free and open source, which would
| you pick?
|
| Aside from keeping competition at bay, this move also gives
| Meta leverage because ecosystems are now being built around
| their projects. If these models see wide-scale adoption they
| could later launch AudioCraft+ as a licensed version with some
| extra features for example.
|
| Alternatively, they might offer support or hosting for their
| open source projects.
|
| Right now though I think the primary benefit of these open
| sourced models is to attract talent. If Meta is seen as one of
| the leaders in AI then researchers will want to work for them
| simply for the prestige.
|
| Arguably one of the reasons Meta has been behind so many
| awesome projects like PyTorch and React over the last decade
| was because they were seen as the cool place for recently
| graduated, but talented software engineers to work in ~2010.
| unnouinceput wrote:
| The same move Microsoft made back in the '90s to kill
| Netscape. Make your product the one available to the
| masses, and the next generation of users will be using your
| product.
| zyang wrote:
| I was just thinking how Google made Android free to check
| Microsoft. This is Meta checking Google.
| roody15 wrote:
| Checking Google or OpenAI (or both?)
| ipaddr wrote:
| Checking OpenAI. Google is still playing checkers.
| conductr wrote:
| The fact you believe, rightly or wrongly, that meta is
| ahead of google on ai explains why meta would open source
| this. It's a good reputation to maintain.
| mcbuilder wrote:
| I just can't get over how bad Google is doing. They have a
| ton of top researchers, papers, money - just no good LLMs.
| It's like OpenAI was first to the punch, and everyone
| else just saw $$$. Meta was smart to go down this open
| source road, as the masses will start training their
| llamas one way or another. Personally I believe the
| "intelligence" aspect will asymptote, so even having
| exclusive access to a "super AI" (i.e. a hypothetical 1T
| parameter model like a GPT-5) won't put you that much of a
| step ahead of the lesser AIs, and as soon as you grant
| access to the masses they will start to use transfer
| learning to make their "lesser" models better. AI
| applications, though, still need a lot of work. The models
| aren't smart or general-purpose enough to be useful to the
| average person out of the box.
| rvnx wrote:
| The problem also is that Google keeps making grandiose
| announcements about tools and models that nobody can see
| or use. This is a serious credibility problem in the long
| term.
| nprateem wrote:
| If people love hanging out with chatgpt or bard, they won't be
| wasting their precious little eyeballs on FB/Insta
| klapinat0r wrote:
| Somewhat relevant, Yann LeCun insisted the research should be
| open sourced. At least in an academic sense.
|
| He touches on it briefly in this podcast episode:
| https://www.therobotbrains.ai/who-is-yann-lecun
| hereonout2 wrote:
| Was asking myself the same earlier, I'm sure it is largely to
| do with publicity and the fact that selling these services is
| not their core business. At the very least releasing this stuff
| probably won't damage their core business but will take the
| sheen off of some other big names.
|
| I wondered though, generative AI is hurling us into a world
| where we'll need more mechanisms to sort real from fake,
| provenance will play a large part, and meta's platforms could
| be part of the answer. i.e. content linked to actual verifiable
| people.
| vasili111 wrote:
| They will own most popular open models so they can dictate the
| direction in open source AI.
| squidsoup wrote:
| The demos are, unsurprisingly, soulless muzak. This contributes
| nothing to our culture.
| TheAceOfHearts wrote:
| AudioGen seems really fascinating. I have some dumb questions.
|
| While the datasets used for training AudioGen aren't available,
| is there any kind of list where one can review the tags or
| descriptions of the sounds on which the model was trained?
| Otherwise how do you know what kinds of sounds you can reasonably
| expect AudioGen to be capable of generating? And what happens if
| you request a sound which is too obscure or something not found
| in the dataset?
|
| What are AudioGen's capabilities regarding spatial positioning?
| First example: can it generate a siren that starts in front,
| moves left to right, and completes a full circle around the
| listener? Second example: can it do the same siren on the Y
| axis, so it starts at the front, goes over the listener, and
| then goes under them to complete the circle?
| johnwheeler wrote:
| How does meta plan to make money from this open source?
| RaiyanYahya wrote:
| Would really be interested to run this locally
| pmarreck wrote:
| Wouldn't Pandora's possibly vast library associating textual
| descriptions with music be the ideal training data for something
| like this?
| chrisjj wrote:
| Yes... except that Pandora's library does not include the
| music.
| MadDemon wrote:
| We built a Mac/Windows app around the original MusicGen so people
| can experiment with it on their own machine with a simple UI
| (https://samplab.com/text-to-sample).
| ouraf wrote:
| It will be a licensing nightmare, just like Llama1 was.
|
| But clever devs can study it to make better software and pressure
| them for a better license on the next release. Worked for LLaMA 2
| Pannoniae wrote:
| "generating new music in the style of existing music" will
| probably be a huge field soon. I can't wait for it to happen,
| it's a low-cost way of producing even more music to listen to.
| gilmore606 wrote:
| > I can't wait for it to happen, it's a low-cost way of
| producing even more music to listen to.
|
| I can't really understand this. I'm a DJ and a huge music nerd,
| and I spend a lot of time every week discovering new music from
| the past 100 years and all over the world, and I'm constantly
| struck by _how much of it there is_. I've spent weeks just
| digging through psych-funk records from West Africa from the
| 1970s.
|
| How can you have the impression we're so desperate for more
| music that we need computer programs to generate it for us?
| Pannoniae wrote:
| Yes but it's not really fungible. My favourite artist is Fats
| Waller and they don't make anything like that anymore. Most
| people are only interested in _some_ category of music, not
| all of it.
| emporas wrote:
| There is a lot of human music for sure, great music from all
| eras, but just the other day I generated a song that was
| pure crystal harp. So, how many crystal harp songs are out
| there? 1,000 all in all? 10,000, maybe? Now I can generate a
| thousand crystal harp songs per day.
| lm28469 wrote:
| What if I generated a piece of music from 12 billion farts -
| how many asses are out there? 7 billion all in all, maybe?
| Now I can generate a billion fart songs per day.
| IAmGraydon wrote:
| Musician here. While I agree with you that there is a nearly
| endless heap of music to dig through, I think it's
| interesting to think about the possibility of hearing genre
| crossovers and styles that don't yet exist.
|
| As an aside, a lot of musicians seem to dislike this kind of
| technology, but I never saw music as a competition. I don't
| care if some inexperienced kid is generating bangers from his
| bedroom even though he can't play a single instrument. It's
| just something else to listen to. I write music for me.
| cooper_ganglia wrote:
| Music is self-expression. I don't always identify entirely
| with others, but I always identify with myself. Having music
| generated for you on such a personalized level is an
| attractive prospect.
|
| I don't think this replaces "100% organic, human-made" music,
| though. I think there'll always be a reason to listen to
| music made by other people. But I think this changes the
| landscape of how and why people create music to begin with.
| It certainly will devalue existing music, since everyone has
| something they may prefer that they can generate instantly.
|
| I think generative AI is a terrible technology for artists
| who want to make money from their art, but in my personal
| opinion, I strive for a world where art isn't a transaction,
| but a gift of human expression and connection. A world where
| art is appreciated for the emotion, stories, and ideas it
| conveys rather than the monetary value it holds. Generative
| AI might disrupt the traditional economic models in the art
| world, but it also opens up new opportunities for creative
| exploration and personal expression. It's a challenging
| evolution, but one that could potentially democratize art,
| making it more accessible and personal than ever before!
| Bring on the Renaissance: Part 2!
| notmypenguin wrote:
| I don't agree with your definition of music. For me, as
| both a musician and a listener, music is communication
| between human beings via harmonic carrier waves. Using a
| machine to make word salad copies of existing communiques
| is literally just nonsense to me
| d0odk wrote:
| if you want to express yourself musically, then learn how
| to play an instrument and compose music
| rideontime wrote:
| > I strive for a world where art isn't a transaction, but a
| gift of human expression and connection. A world where art
| is appreciated for the emotion, stories, and ideas it
| conveys rather than the monetary value it holds.
|
| In a world where nobody is compensated for their art, the
| only people making art will be the ones privileged enough
| to have the means to do so for free. I don't see how this
| leads to "Renaissance: Part 2."
| pessimizer wrote:
| What's self-expressive about an algorithm that generates
| songs?
|
| Recorded music is the worst thing that happened to music.
| ckornby wrote:
| it can't be a "gift of human expression and connection" if
| A) a machine creates it and B) nobody but you ever hears it
|
| this isn't democratizing art, and i would argue it has
| nothing to do with art. it is giving us an endless faucet
| of content, but not art.
| lm28469 wrote:
| > I can't really understand this
|
| I came to understand a very large portion of the population
| just wants content, any type, any quality, to fill the void.
| They'll consume anything as long as it's new. Content to fill
| the empty vessels we became. Just look around, mainstream
| music, movies, podcasts, news, it's mostly mediocre, but it
| goes real fast, you get new mediocrity delivered every day
| dvngnt_ wrote:
| Frank Sinatra sings Lil Jon's "Get Low" -
| https://youtu.be/7zoQeH2wQFM
| cushpush wrote:
| Nice results so far. "Perfect for the beach" is a very funny
| description of music, because it has nothing to do with the
| acoustic qualities, so consider these descriptions to be
| anthropocentric! (As if they could be anything else) It is less
| about describing the actual sounds you want and more about
| describing the quality or vibe of the atmosphere. This is
| markedly different from incremental composition; maybe we can
| call it "composition by collage." _Puts on COLLAGE shirt like in
| Animal House_
| zapdrive wrote:
| Incoming strike by American Federation of Musicians in 3, 2,
| 1....
|
| How many jobs would this thing take away? One of the biggest
| time sinks in any video production is post-production audio,
| including background music, audio, Foley, etc. This will
| automate almost all of it!
| chrisjj wrote:
| This would take away all the jobs producing such low-quality
| low-fi artifact-laden background music... if any existed.
| praveenhm wrote:
| Does this model help with TTS (text-to-speech)? It's badly
| needed; the only free options right now are Bark and
| Tortoise TTS.
| aedocw wrote:
| Coqui TTS with VCTK/VITS is very good right now. Not as good
| as ElevenLabs or Coqui Studio, but for fast open TTS it's
| pretty good, in case you're not familiar with it.
|
| It will be great when there's eventually something open that
| competes with the closed models out there.
| praveenhm wrote:
| Excellent, I will take a look at this.
| avereveard wrote:
| https://www.audiogen.co/ related? unrelated? big fight coming up?
| tibanne wrote:
| Can I generate a reading by someone if I have a lot of their
| voice samples with this? Or is there a better tool for doing such
| a thing?
| lm28469 wrote:
| elevenlab and dozen others already do that
| padjo wrote:
| Quite impressive, although if these are the cherry-picked
| examples, the average output must be pretty weak! There's
| nothing catchy about most of these examples, and the reggae
| one is pretty lame.
| squidsoup wrote:
| "Imagine a professional musician being able to explore new
| compositions without having to play a single note on an
| instrument." A musician will always reach for an instrument as
| their compositional tool - keyboard and mouse producers are not
| musicians.
| brennanm wrote:
| This makes no sense.
|
| How is pressing a key on a piano different from pressing a key
| on an electronic piano?
| squidsoup wrote:
| I was referring to a computer keyboard, not an electric
| piano. I can't see how any musician would see this appealing
| as a compositional tool. Music is its own language -
| expressing a musical idea with a text prompt is antithetical
| to the process of making music.
| waffletower wrote:
| I think it is a mistake to acquiesce and let copyright owners
| bully AI model trainers over model data inputs. The endgame of
| this practice is a "pay per thought" society. This is separate
| from speculation regarding machine sentience -- as interfaces
| improve AI models will serve more and more as direct human
| extensions of mind. While copyright duration is a separate issue,
| and the current durations are appalling, copyright violations
| should focus strictly upon the output of models and how they are
| utilized. There are so many melodies in my head that I have
| never paid for nor ever will (some of which I would love to
| remove). AI models also need to have the same unfettered
| access to the
| commons as we do. Infringement occurs on the outputs --
| application of copyright restrictions on model inputs is a
| violation of Fair Use and a definite money grab.
| justinclift wrote:
| Wonder how far off the whole "generate music based on your
| existing music library" thing is going to be?
|
| That'll make musicians happy with big tech as well, just like
| artists are. *sigh*
| spudlyo wrote:
| Perhaps LoRA (Low-Rank Adaptation) training techniques could be
| used for these types of models, like they're currently being
| used with LLMs and latent text-to-image diffusion models.
| operator-name wrote:
| Sadly looks unlikely if the base model wasn't trained on
| vocals.
|
| > Mitigations: Vocals have been removed from the data source
| using corresponding tags, and then using a state-of-the-art
| music source separation method, namely using the open source
| Hybrid Transformer for Music Source Separation (HT-Demucs).
|
| > Limitations: The model is not able to generate realistic
| vocals.
|
| (https://github.com/facebookresearch/audiocraft/blob/main/mod
| ...)
|
| I suspect this was a combination of playing it safe and that
| the model isn't well architected to reproduce meaningful
| vocals.
| chrisjj wrote:
| Anyone who has listened to MusicGen's output samples would
| surely answer "a million miles".
|
| Seriously, you couldn't sell this output for a free mobile
| clicker game.
| zitterbewegung wrote:
| Why not generate music you like, which wouldn't need you to
| upload your library and would have RLHF baked in?
| ElFitz wrote:
| Something like the algorithm TikTok uses. First probing by
| offering a variety of content that should match based on what
| little information you have on the user (ip location, locale,
| etc).
|
| Then use the user's actions to iteratively refine your
| classification, until you end up with something tailor-made.
| [deleted]
| LewisVerstappen wrote:
| The record labels are _far_, _far_ more litigious than the
| art community.
| PaulDavisThe1st wrote:
| They can't litigate a person doing this at home, and never
| redistributing.
|
| I suppose they might try, anyway.
| jsheard wrote:
| Is training a "pirate model" something you'd reasonably be
| able to do at home though, given the compute requirements?
| The analogous "image generation at home" is only possible
| due to a for-profit entity with significant resources
| choosing to (a) play fast-and-loose with the provenance of
| their training set and (b) giving away the resulting model
| for free, if the open source community had to train their
| models from scratch then as best as I can tell they would
| still be stuck in the dark ages generating vague goopy
| abominations.
| PaulDavisThe1st wrote:
| Currently, yes, available compute power @ home does
| indeed seem like a limitation. Whether that remains true
| going forward seems a little unclear to me.
| throwuwu wrote:
| You could take a model trained on CC content and then
| fine tune it on copyrighted material cheaply and quickly
| kmeisthax wrote:
| The RIAA pioneered copyright enforcement at the individual
| level back in the 2000s, they absolutely would try to sue
| downstream AudioCraft users.
| PaulDavisThe1st wrote:
| [flagged]
| kmeisthax wrote:
| Legal acquisition does not matter for AI training. If
| training is fair use then you can train on pirated
| material (e.g. OpenAI GPT). If it's not fair use then
| buying the material does not matter, you have to
| negotiate a specific license for AI training for each
| work in the training set, which is impractical at the
| scales most AI companies want to work.
| PaulDavisThe1st wrote:
| This seems to distort the issue a little bit.
|
| If you purchase the music, you have a (sometimes
| explicit, sometimes implicit) license to do certain
| things with the music, entirely independent of any
| concept of "fair use".
|
| The question is not "is training part of fair use?" but
| "is training part of, implicitly or explicitly, the
| rights I already have after purchase?"
|
| Given that "training" can be done by simply playing the
| music in the presence of a computer with its microphone
| turned on, it's not clear how this plays out legally.
| kmeisthax wrote:
| In the US, exceptions to copyright come across in two
| distinct bundles: first sale and fair use. They exist
| specifically because of the intersection between
| copyright law and two other principles of the US
| constitution:
|
| - First sale: The Takings Clause prohibits government
| theft of private property without compensation. Because
| copyright owners are using a government-granted monopoly
| to enforce their rights, we have to bound those rights to
| avoid copyright owners being able to just come and take
| copies of books or music you've lawfully purchased.
|
| - Fair use: The 1st Amendment prohibits government
| prohibitions on free speech. Because copyright owners are
| using a government-granted monopoly to enforce their
| rights, we have to bound those rights to avoid copyright
| owners being able to censor you.
|
| If you hinge your argument on "I bought a copy", you're
| making a first sale argument.
|
| Notably, first sale is limited to acts that do not create
| copies. This limit was established by the ReDigi case[0].
| Copyright doesn't care about the total number of copies
| in circulation, it cares about the right to create more.
| So an AI training defense based on first sale grounds
| would fail because training unequivocally creates copies.
|
| Fair use, on the contrary, does not care if you bought a
| copy of a work legally. It only cares about balancing
| your right to speech against the owners' right to a
| monopoly over theirs. And it has so far been far more
| resistant to creative industry attempts to limit
| exceptions to copyright - to the point where I would
| argue that "fair use" is an effective shorthand for any
| exception to copyright, including ones in countries that
| have no fair use doctrine and do not respect judicial
| precedent.
|
| The courts won't care how the training comes about, just
| if the act of training an AI alone[1] would compete with
| licensing the images used in the training set data.
|
| [0] https://en.wikipedia.org/wiki/Capitol_Records,_LLC_v.
| _ReDigi....
|
| [1] Notably, this is separate from the act of using the
| AI to generate new artistic works, which may be
| infringing
| PaulDavisThe1st wrote:
| It hasn't been established yet that a diffusion-model
| generated work is a copy or a derivative of any
| particular element of the training set.
| NegativeK wrote:
| The people above are arguing about being caught, not
| legality.
| manquer wrote:
| Even before streaming, you never "owned" any music
| legally [1]; you merely owned a physical copy of a
| performance [3] of a song, and that in no way automatically
| gives you the right to make derivative works [2].
|
| Also, it doesn't really matter what the law says. The RIAA,
| in the last iteration, relied on the fact that on average
| you would rather pay a fine than pay for the expensive
| lawyers needed to fight the specifics out in court.
|
| It was always about a disproportionate ability to bring
| resources against individual "offenders", to create fear
| among everyone and deter "undesirable" forms of copying -
| not necessarily about what the legal protections were.
|
| ---
|
| [1] Unless you specifically commissioned it under a
| contract which gave you the right
|
| [2] See recent cases including those related to Kris
| Kashtanova and Andy Warhol.
|
| [3] Not the song, just the performance, aka the Taylor Swift
| version. For a good explanation of how the rights are
| divvied up in the music industry, a Planet Money series
| covers it well: https://www.npr.org/sections/money/2022/10
| /29/1131927591/inf...
| JohnFen wrote:
| > that in no way gives you the right to make derivative
| works
|
| True, but only because you have that right anyway. I can
| do anything I like with copyrighted content I legally
| possess, as long as I don't distribute the results of my
| efforts.
| PaulDavisThe1st wrote:
| Establishing derivation is at the crux of all legal
| matters surrounding diffusion models. It has not yet been
| clearly established. If it is, then I'd agree with you.
| Until then, I think it's a bit more up in the air.
|
| Also, IIRC, RIAA did not bring many resources to bear
| against e.g. "home taping" itself, because they could
| essentially never know that it had occurred. The
| overwhelming majority of their efforts went into trying
| to takedown people distributing multiple copies.
|
| The Kashtanova case does not cover derivation in any real
| way, but is really about copyright attribution choices
| between human and software.
|
| The Warhol case specifically tests a fair use claim, not
| a derivation claim.
| cmgbhm wrote:
| https://en.wikipedia.org/wiki/Audio_Home_Recording_Act
|
| https://en.wikipedia.org/wiki/Home_Taping_Is_Killing_Musi
| c
|
| Recording industries have fought end user reproduction
| often. They've fought sampling battles.
|
| Go after the pocketbooks and go after the technology
| waves. If there's a derivative argument they can make,
| they will.
| [deleted]
| throwaway290 wrote:
| They should start with AudioCraft itself, conceptually
| it's derivative work and it doesn't matter if it's "open
| source" or not. Try throwing in someone's sample in a
| song and publish it saying "no copyright infringement
| intended and I totally don't make any money from it"...
| If it becomes popular, see how long it stays up until
| DMCA takedown. And we know this dataset is already
| popular.
| PaulDavisThe1st wrote:
| > and publish it
|
| This is precisely the opposite of the context I was
| remarking on.
| Workaccount2 wrote:
| Ugh, I dread having to listen to everyone's hyper-personal
| music because they swear up and down, to the point of tears,
| that _"IT'S THE BEST SONG EVER CREATED! EVER!!!"_, while
| they constantly prod you to affirm how amazing the song is.
|
| Bruh, music is subjective as hell, and I can already tell I
| hate this song.
| davidw wrote:
| Is there a way to try this out? I didn't see one, but didn't look
| too hard.
| smallerfish wrote:
| Yes. Installation instructions on the front of the repo, then
| click on the model readme for sample getting started code (10
| lines of python and you get output.)
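[Editor's note] The getting-started pattern referenced above follows the repo README's MusicGen API. The sketch below is an assumption-laden illustration, not Meta's exact code: the model name `facebook/musicgen-small`, the `generate_clip` wrapper, its parameters, and the output filename are all chosen here for illustration; check the AudioCraft README for the current API.

```python
def generate_clip(prompt: str = "upbeat acoustic folk", seconds: int = 8) -> str:
    """Generate a short music clip from a text prompt and write it to disk.

    Imports are deferred so this sketch only needs the audiocraft package
    (and a model weight download) when it is actually run.
    """
    from audiocraft.models import MusicGen
    from audiocraft.data.audio import audio_write

    # Smallest released checkpoint; larger variants trade speed for quality.
    model = MusicGen.get_pretrained("facebook/musicgen-small")
    model.set_generation_params(duration=seconds)
    wav = model.generate([prompt])  # one waveform tensor per prompt
    # audio_write appends the .wav suffix and normalizes loudness.
    audio_write("clip", wav[0].cpu(), model.sample_rate, strategy="loudness")
    return "clip.wav"

# Running this downloads model weights and benefits from a GPU:
# generate_clip("lofi hip hop beat")
```

Note the `demos/` folder (with `musicgen_app.py` and the notebooks mentioned upthread) ships in the cloned source tree, not in the `pip install audiocraft` package, which is why a bare pip install leaves no obvious entry point.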
| gavman wrote:
| > MusicGen, which was trained with Meta-owned and specifically
| licensed music, generates music from text-based user inputs,
| while AudioGen, which was trained on public sound effects,
| generates audio from text-based user inputs.
|
| Meta is really clearly trying to differentiate themselves from
| OpenAI here. Open source + driving home "we don't use data we
| haven't paid for / don't own".
| jstummbillig wrote:
| Yes. Meta is in the business of commanding as much of people's
| time as possible. AI is more or less the biggest danger to this
| model (apart from legislation, theoretically, but let's not kid
| ourselves). Making AI a commodity is in their very interest.
| JeremyNT wrote:
| The fact that Meta is able to lie and call their restrictive
| licensing open source is nearly as misleading as "OpenAI."
|
| We need to do better than to repeat these claims uncritically.
| The weight licenses are not "open source" by any useful
| definition, and we should not give Meta kudos for their
| misleading PR (especially considering that they almost surely
| ignored any copyright when training these things - rules for
| thee, but not for me).
|
| "Not as closed as OpenAI" is accurate, but also damning with
| faint praise.
| voz_ wrote:
| Can you chill? It's def open source
| mkl wrote:
| The source code is, as it's MIT, but the weights are not,
| as they're CC-BY-NC:
| https://github.com/facebookresearch/audiocraft#license
| Filligree wrote:
| So I can build a business on it, then?
| chaxor wrote:
| Research does exist you know. This is immensely helpful
| for a huge number of people in academia.
|
| If you want to build a company, perhaps you should do
| what everyone in the industry has done for millennia,
| copy the movements performed and optimize them while
| doing so.
| __loam wrote:
| I believe Meta has explicitly said that you can, but
| that's not what open source means and the model isn't
| open source.
| mkl wrote:
| Meta says to _imagine_ you can: "Imagine a professional
| musician being able to explore new compositions without
| having to play a single note on an instrument. Or an
| indie game developer populating virtual worlds with
| realistic sound effects and ambient noise on a shoestring
| budget. Or a small business owner adding a soundtrack to
| their latest Instagram post with ease."
|
| In reality, you can't, as they licensed the weights for
| noncommercial use only:
| https://github.com/facebookresearch/audiocraft#license
| version_five wrote:
| about: pytorch @ fb.
| j_maffe wrote:
| Just some general piece of advice: it's not productive to
| constantly be giving out the worst criticism you possibly can
| when someone does something that's not terrible but still
| unacceptable. Doing so just tells the companies that nothing
| satisfies the community and that they should stop trying.
| Instead, it's better to mention what they did right and point
| to how they can make it better.
| [deleted]
| samstave wrote:
| They are doing PR damage control with an influx of AI stuff,
| due to the ridicule of the metaverse and the recent revelations
| about Threads (for which they are playing the long AI game).
| Are we not concerned about all the Threads, IG, and other
| accounts being linked via internal LLMs we will never hear
| about?
| ChildOfChaos wrote:
| It's likely partly a PR/branding exercise as well.
|
| In the new world that Meta sees, of VR/AR and AI, Meta is
| already in a position where people don't want them to have much
| power, because they don't trust them over privacy etc. Meta is
| trying to pivot to become more trustworthy, so they make
| genuine moves in this space.
| smoldesu wrote:
| That, or this is an ongoing research lab (FAIR) that has
| existed for ~half a decade and has advanced the state-of-the-
| art in AI further than Apple, Microsoft and Google combined.
| __loam wrote:
| I would be pretty shocked if meta were that far ahead of
| all 3 of those companies, all of which are also spending a
| fuck load on internal AI research.
| smoldesu wrote:
| If all three of those companies have something to show
| for their research, none of it is at the scale or level
| of accessibility Pytorch, Llama and now Audiocraft offer.
| upshide wrote:
| [dead]
| itsyaboi wrote:
| Bully "Open"AI into rebranding.
| [deleted]
| scrum-treats wrote:
| > "Meta is really clearly trying to differentiate themselves
| from OpenAI here. Open source + driving home "we don't use data
| we haven't paid for / don't own"."
|
| Isn't Meta settling lawsuits for this right now? In addition to
| violating user privacy (another lawsuit)...
|
| Meta is attempting to destroy competition; that's it. Similar
| to how they paid a fortune to lobby against Tiktok for the
| exact reasons Meta is under active investigation (again). The
| irony.
| croes wrote:
| "If we don't win here, then at least we'll kick their lawn to
| pieces."
| kmeisthax wrote:
| This is purely a function of everyone remembering the RIAA's
| decade-long campaign to prevent people from taking the music
| they had rightfully stolen. As far as I'm aware LLaMA was
| trained on "publicly available data"[0], not "licensed data".
|
| Furthermore, MusicGen's weights are licensed CC-BY-NC, which is
| effectively a nonlicense as there is no noncommercial use you
| could make of an art generator[1]. This is not only a
| 'weights-available' license; it's significantly more
| restrictive than the morality-clause-bearing OpenRAIL license
| that Stability likes to use[2].
|
| [0]
| https://github.com/facebookresearch/llama/blob/main/MODEL_CA...
|
| [1]
| https://github.com/facebookresearch/audiocraft/blob/main/LIC...
|
| [2] These are also very much Not Open Source(tm) but the
| morality clauses in OpenRAIL are at least non-onerous enough to
| collaborate over.
| Blackthorn wrote:
| > there is no noncommercial use you could make of an art
| generator
|
| I'm sorry, what?
| eropple wrote:
| _> MusicGen's weights are licensed CC-BY-NC, which is
| effectively a nonlicense as there is no noncommercial use you
| could make of an art generator_
|
| How do you figure? Have you never just...made stuff to make
| stuff?
| analognoise wrote:
| I think the key word there is "noncommercial".
| dragonwriter wrote:
| Yes, but you can easily make noncommercial use of an art
| generator.
|
| Obviously, you can't host a commercial art generation
| service with a noncommercial-use license, and (insofar as
| art produced by a generator is a derivative work of the
| model weights, which is a controversial and untested
| legal theory) you can't make commercial art with a
| noncommercial license, but not all art is commercial.
| kmeisthax wrote:
| "Noncommercial art" is not a thing in the eyes of the
| law. Even if you don't intend to make money the law still
| considers the work itself to be commercial. That's why
| CC-BY-NC has to have a special "filesharing is non-
| commercial" statement in it, because people have made
| successful legal arguments that it is.
|
| You're probably thinking of "not charging a fee to use",
| which is a subset of all the ways you can monetize a
| creative work. You can still make money off of AudioCraft
| by just hosting it with banner ads next to the output.
| Even a "no monetization" clause[0] would be less onerous
| than "noncommercial use only", because it'd at least be
| legal to use AudioCraft for things like background music
| in offices.
|
| [0] Which already precludes the use of AudioCraft music
| on YouTube since you can't do unmonetized uploads anymore
| dragonwriter wrote:
| > "Noncommercial art" is not a thing in the eyes of the
| law
|
| The definition of "NonCommercial", the oddly capitalized
| term of art in the license, is not a matter of general
| law, it is a matter of the license, which defines it as
| "not primarily intended for or directed towards
| commercial advantage or monetary compensation. For
| purposes of this Public License, the exchange of the
| Licensed Material for other material subject to Copyright
| and Similar Rights by digital file-sharing or similar
| means is NonCommercial provided there is no payment of
| monetary compensation in connection with the exchange."
|
| > Even if you don't intend to make money the law still
| considers the work itself to be commercial.
|
| Even if you _do_ make money, if the use is "not primarily
| intended" for that purpose, it is "NonCommercial" in the
| terms of the license.
|
| > That's why CC-BY-NC has to have a special "filesharing
| is non-commercial" statement in it, because people have
| made successful legal arguments that it is.
|
| It has the filesharing term in it because it permits that
| particular exchange-of-value as a _primary purpose_.
|
| > Even a "no monetization" clause would be less onerous
| than "noncommercial use only"
|
| How would a clause that prohibits monetization entirely
| be less onerous than one which prohibits it only as the
| primary intent of use?
|
| > it'd at least be legal to use AudioCraft for things
| like background music in offices.
|
| It is legal to use it for that purpose (in a for-profit
| enterprise, I suppose, one might make an argument that
| _any_ activity was ultimately primarily directed at
| "commercial advantage", but in a government or many
| nonprofit environments, that wouldn't be the case.)
| vel0city wrote:
| In their example audio clips they have a "perfect for the
| beach" audio track. With your understanding of the NC
| license, would a resort or private beach club be able to
| play a similar generated music track at their poolside
| bar or something along those lines? Their primary
| intention of the bar isn't to play the music, its just an
| additional ambiance thing; they're trying to sell drinks
| and have guests pay membership fees, people aren't really
| coming because of the background music.
|
| I realize, this isn't legal advice, YMMV, etc.
| Blahah wrote:
| Yes it is. Art that I make for my own enjoyment is
| noncommercial. Art that I make to explain concepts to my
| son is noncommercial.
| kmeisthax wrote:
| In copyright law the use of the work itself is considered a
| commercial benefit, so "noncommercial use" is an oxymoron.
| Consider these situations:
|
| - If I use AudioCraft to post freely-downloadable tracks on
| my SoundCloud, I still get the benefit of having a large
| audio catalog in my name, even if I'm not selling the
| individual tracks. I could later compose tracks on my own
| and ride off the exposure I got from posting
| "noncommercially".
|
| - If I run AudioCraft as a background music generator in my
| store, I save money by not having to license music for
| public performance.
|
| - If I host AudioCraft on a website and put ads on it, I'm
| making money by making the work available, even though I'm
| not charging a fee for entry.
|
| I suspect that a lot of people reading this are going to
| have different arguments for each. My point is that if you
| don't think that all of these situations are equally
| infringing of CC-BY-NC, then you need to explain _why_ some
| are commercial and some are not. Keep in mind that every
| exception you make can be easily exploited to strip the NC
| clause off of the license.
|
| If you're angry at the logic on display here, keep in mind
| that this is how _judges_ will construe the license, and
| probably also how Facebook will if you find a way to make
| any use of their AI. The only thing that stops them from
| rugpulling you later is explicit guidance in CC-BY-NC.
| Unfortunately, the only such guidance is that they don't
| consider P2P filesharing to be a commercial use.
|
| So, absent any other clarifications from Facebook, all you
| can do without risking a lawsuit is share the weights on
| BitTorrent.
|
| EDIT: And yes, I _have_ made stuff just to make stuff. I
| license all of that under copyleft licenses because they
| express the underlying idea of 'noncommercial' better than
| actual noncommercial clauses do.
| Tao3300 wrote:
| > if you don't think that all of these situations are
| equally infringing of CC-BY-NC, then you need to explain
| why some are commercial and some are not. Keep in mind
| that every exception you make can be easily exploited to
| strip the NC clause off of the license.
|
| You're right: those are all equally infringing CC-BY-NC.
| I don't see a problem.
| dragonwriter wrote:
| > My point is that if you don't think that all of these
| situations are equally infringing of CC-BY-NC, then you
| need to explain why some are commercial and some are not.
|
| What "NonCommercial" means in the license is explicitly
| defined _in_ the license, and if you think either those
| examples or, more to the point, every possible use is
| commercial, so as to render 'NonCommercial' into 'no use' as
| you have claimed, _you_ need to make that argument based on
| the definition in the license, not on some concept of what
| might be construed as commercial use by general legal
| principles if the license used the term without its own
| explicit definition.
| NegativeK wrote:
| Is listening at home a violation of NC? That's what I've
| interpreted as its intent.
| stale2002 wrote:
| This is a weird comment.
|
| Do you think that non commercial use simply doesn't exist
| or something?
|
| Because noncommercial use isn't some crazy concept. It
| is a well-established one, and it doesn't exclude literally
| everything.
|
| Also, you are ignoring the idea that Facebook will almost
| certainly not sue anyone for using this for any reason,
| except possibly Google or Apple.
|
| So if you aren't literally one of those companies you
| could probably just use it anyway, ignore the license
| completely, and have zero risk of being sued.
| elondaits wrote:
| The issue with "non commercial" is that no, it's not well
| established. Licenses with an NC clause are so problematic as
| to be practically useless. If you just want to use
| something at home privately you don't need a CC
| license... a CC license is for use and redistribution.
|
| http://esr.ibiblio.org/?p=4559
| robertlagrant wrote:
| What about playing the music in a government building as
| elevator music, for example?
| Tao3300 wrote:
| I miss that blog. It was a little crazy and the comments
| were a flame war shitshow, but man it was fun to read
| sometimes. Even if I vehemently disagreed, it got me
| thinking.
|
| Whatever happened to esr? Did he just get too paranoid
| and clam up?
| pbhjpbhj wrote:
| >If you just want to use something at home privately you
| don't need a CC license... //
|
| I presume you mean in USA, because in UK you don't have a
| general private right to copy. Our "Fair Dealing" is
| super restrictive compared to Fair Use.
| kmeisthax wrote:
| Funnily enough in the UK they actually tried to fix this.
| The music industry argued that the lack of a private
| copying levy made legalized CD ripping into government
| confiscation of copyright ownership... somehow. The UK
| courts bought this, so now the UK government is
| constitutionally mandated to ban CD ripping, which is
| absolutely stupid.
| kmeisthax wrote:
| Noncommercial use is not well established in copyright
| law, which is the law that actually matters. I know other
| forms of law actually do establish noncommercial and
| commercial use standards, but copyright does not
| recognize them.
|
| As for "Facebook won't sue"? Sure, except we don't have
| to worry about just Facebook. We have to worry about
| anyone with a derivative model. There's an entire
| industry of copyleft trolls[0] that could construct
| copyright traps with them.
|
| Individuals can practically ignore NC mainly because
| individuals can practically ignore most copyright
| enforcement. This is for the same reason why you can
| drive 55 in a 30mph zone and not get a citation. It's not
| that speeding is now suddenly legal, it's that nobody
| wants to enforce speed limits - but you can still get
| nailed. The moment you have to worry about NC, there is
| no practical way for you to fit within its limits.
|
| [0] https://www.techdirt.com/2021/12/20/beware-copyleft-
| trolls/
| dragonwriter wrote:
| > Noncommercial use is not well established in copyright
| law, which is the law that actually matters.
|
| No, for "NonCommercial", what actually matters is the
| explicit definition in the license.
| wpietri wrote:
| What's your evidence for this bit?
|
| > this is how judges will construe the license
| ericpauley wrote:
| My understanding (IANAL) [1] is that copyright licenses have
| no say on the output of software. Further, CC licenses don't
| say anything about _running_ or using software (or model
| weights). It's therefore questionable whether the CC-BY-NC
| license actually prevents commercial use of the model.
|
| [1] https://opensource.stackexchange.com/questions/12070/allo
| wed...
| cosmojg wrote:
| You're correct, but no one has had the balls (or the
| lawyers) to clarify this in court yet. Expect to see
| hosting providers complying with takedown requests for the
| foreseeable future.
| mcbits wrote:
| I don't remember the details (or outcome) but there was a
| lawsuit a few years ago involving CAD or architecture
| software and whether they could limit how the output
| images were used because they were assemblages of clipart
| that the company asserted were still protected by
| copyright. Something like that. A lot of "AI" output
| potentially poses a similar issue, just at a far more
| granular level.
| indymike wrote:
| Hosting providers *have* to comply with takedown requests
| to maintain safe harbor.
| Tepix wrote:
| You're wrong because software, as you describe it, includes
| the "cp" command which creates a perfect copy.
| ericpauley wrote:
| As sibling noted, we're talking about the impact of a
| software's license on use of _its_ output.
|
| I suppose your point would stand if the software were a
| quine?
| tikhonj wrote:
| The copyright license _of the cp code itself_ has no
| bearing on the copyright of what you produce (well, copy)
| with cp.
| robertlagrant wrote:
| That's not the point they're making. They're replying to
| their parent comment.
| rvnx wrote:
| Google is running on "publicly available data", not "licensed
| data"
| schleck8 wrote:
| > as there is no noncommercial use you could make of an art
| generator
|
| r/stablediffusion gives you a hundred examples daily of
| people just having fun and not thinking of monetizing their
| generations
| agilob wrote:
| Goddamn, Facebook being the good guy...
| version_five wrote:
| They're not, they're playing a longer Microsoft style game to
| corrupt the meaning of open source, and releasing models
| under their terms to undermine competitors.
| deepvibrations wrote:
| Nah, this is just the modern tech playbook: First you open
| source stuff, then you can monitor all the related
| development happening and whenever you see areas of
| interest/popularity, you simply clone the functionality or
| buy out whatever entity is building that interesting stuff.
| archontes wrote:
| You don't own data. You can sometimes copyright data.
|
| https://www.americanbar.org/groups/science_technology/public...
| naillo wrote:
| I wish people made unconditional predictive models for music
| instead of text-to-music ones. It would be so cool to give an
| input 'inspiration' track that it 'riffs' a continuation to.
| That's usually what I want: just continue this track, it's too
| short, and that's what I want to hear more of. (That said, this
| is super cool.)
| bottlepalm wrote:
| Anyone feel like with the flood of AI generated content there's a
| risk of the past being 'erased'. Like in 10 years we won't be
| able to tell if any information from the past is real or fake -
| sounds, pictures, videos, etc.. Like we need to start
| cryptographically signing all content now if there's any hope of
| being able to verify it as 'real' 10 years from now.
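The "sign it now, verify it later" idea above can be sketched with a content digest recorded at publication time. This is a minimal stdlib illustration, not a real provenance scheme; a production system would use asymmetric signatures and trusted timestamping rather than a bare hash:

```python
import hashlib

def fingerprint(data: bytes) -> str:
    # SHA-256 digest published alongside the content at creation time.
    return hashlib.sha256(data).hexdigest()

def verify(data: bytes, recorded_digest: str) -> bool:
    # Years later: the bytes are authentic only if they still match the record.
    return fingerprint(data) == recorded_digest

original = b"field recording, 2023-08-02"
digest = fingerprint(original)

print(verify(original, digest))         # unaltered content verifies
print(verify(original + b"x", digest))  # any alteration is detectable
```

The digest only proves the bytes are unchanged since the record was made; proving *when* the record was made, and by whom, is the hard part a real scheme has to solve.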
| JohnFen wrote:
| Yes, this is one of my concerns about all of this. The danger
| is real.
| vagab0nd wrote:
| Even with digital signatures, there are limits to what we can
| really verify.
|
| We'll likely be able to verify whether an entity is a real
| human, using some kind of "proof of humanity" system.
|
| We will have cameras/mics with private keys built-in. The
| content can be signed as it's produced. But in this case,
| what's stopping me from recording a fake recording?
|
| Maybe it's a non-issue. We used text to record history and
| we've been able to manipulate that since, well, forever.
| crazygringo wrote:
| No. We've had photo and audio manipulation for many decades
| now. For a long time now, we've had to separate out what's
| credible from what's bullshit.
|
| Fortunately, it's pretty simple in real life. We have certain
| publications and sources we trust, whether they're the NYT or a
| respected industry blog. We know they take accurate reporting
| seriously, fire journalists who are caught fabricating things,
| etc.
|
| If we see a clip on YouTube from the BBC, we can trust it's
| almost certainly legit. If it's some crazy claim from a rando
| and you care whether it's real, it's easy to look it up to see
| if anyone credible has confirmed it.
|
| So no, no worry at all about the past being erased.
| probablynish wrote:
| Seems like it might now become much easier to post a clip on
| YouTube that looks like an authentic BBC clip, logo and all.
| If generative AI gets that good, how will you be able to tell
| whether a particular piece of media comes from a trusted
| source?
|
| Might not be possible on platforms - only if it's posted on a
| trusted domain.
| crazygringo wrote:
| Easy, is it on the official BBC YouTube channel or not?
|
| That's the entire point of having trusted sources. Regular
| people can post whatever fake things they want on their own
| accounts; they can't post to the BBC's YouTube channel or
| to the NYT's website.
| minsc_and_boo wrote:
| Yep, every time technology shifts, reputation systems shift
| in response.
|
| This goes all the way back to yellow news with newspapers:
| https://en.wikipedia.org/wiki/Yellow_journalism
| ysleepy wrote:
| I don't agree. With ML tools it is possible to make sweeping
| changes to images and text that are often impossible to
| detect. Combined with the centralisation of most online
| activities, large players could alter the past.
|
| Imagine facebook decides to subtly change every public post
| and comment to show some particular person or cause in a
| better light.
| crazygringo wrote:
| If one "large player" like the NYT decides to "alter the
| past", you can compare with the WaPo or any other
| newspaper. You can compare with the Internet Archive. You
| can compare with microfiche. These aren't "impossible to
| detect", they're trivial to detect if you bother to
| compare.
|
| We have tons of credible archived sources owned by
| different institutions. And these sources are successful in
| large part due to their credibility and trustworthiness.
|
| It's just not economically rational for any of them to
| start "altering the past", and if they did, they'd be
| caught basically immediately and their reputation would be
| ruined.
|
| This isn't an ML/tooling question, it's a question of
| humans and reputation and economic incentives.
| jononor wrote:
| The suggested large player was Facebook and Facebook
| posts. Which trustworthy independent sources of
| authenticity do we have for that? I do not think those
| you mention reach inside their walled garden?
| crazygringo wrote:
| First, why would Facebook do that? What economic
| incentive would there ever be, that would outweigh the
| loss of trust and reputation hit that would ensue?
|
| Second, people take screenshots of Facebook posts _all
| the time_. They're everywhere. If you suddenly have a
| ton of people with timestamped screenshots from their
| phones that show Facebook has changed content, that's
| exactly the kind of story journalists will pounce on and
| verify.
|
| The idea that Facebook could or would engage in
| widespread manipulation of past content and not get
| caught is just not realistic.
| ysleepy wrote:
| You seem eager to exclude the possibility.
|
| Maybe it is improbable, but there now is the technical
| possibility which was not there before.
|
| It is valuable to explore that possibility and maybe even
| work to prevent such a use.
|
| I would be interested in a ledger of cryptographically
| signed records of important public information such as
| newspapers, government communication and intellectual
| discourse.
|
| Your argument that large social media will behave
| rationally is not backed up by reality. Consider Musk and
| Twitter.
| oceanplexian wrote:
| > If one "large player" like the NYT decides to "alter
| the past", you can compare with the WaPo or any other
| newspaper. You can compare with the Internet Archive. You
| can compare with microfiche. These aren't "impossible to
| detect", they're trivial to detect if you bother to
| compare.
|
| Detection doesn't really matter, because people are too
| lazy to validate the facts, and reporters are not
| interested in reporting them. AI is simply another tool
| to manipulate people, like Wikipedia, Reddit.com,
| Twitter, or any other BS pseudo-authority. Think someone
| will actually crack open a book to prove the AI wrong?
| Not a chance.
| crazygringo wrote:
| > _and reporters are not interested in reporting them_
|
| You really think that if the NYT started altering its
| past stories, other publications would just... ignore it?
|
| It would be a front-page scandal that the WaPo would be
| delighted to report on. As well as a hundred other news
| publications.
|
| Thankfully.
| amelius wrote:
| > We've had photo and audio manipulation for many decades
| now. For a long time now, we've had to separate out what's
| credible from what's bullshit.
|
| The difference is that the floodgates are being opened.
| conductr wrote:
| At a time when "people" seem easily manipulated and fully
| believe their personal feeds of curated outrage. They often
| don't apply the filters they should because of the apparent
| social proof, trust, and biases they have toward the content.
| Contemporary journalists hardly do any fact/source checks as
| it is, so they'll begin reporting on some of this, giving it
| further credibility, and it's just a downward spiral. So, more
| of the same, yay!
| crazygringo wrote:
| It doesn't matter though. Most of the internet is already
| probably mostly SEO blogspam, just like spam e-mail already
| outweighs legitimate e-mail for a lot of (most?) people.
| But nobody cares because it gets filtered out in the ways
| people actually navigate.
|
| We have lots of tools to fight spam, and there's no reason
| to believe they won't continue to evolve and work well.
| pessimizer wrote:
| > We've had photo and audio manipulation for many decades
| now.
|
| We haven't been able to generate 1,000 different forged
| variants of the same speech in a day before.
|
| > We have certain publications and sources we trust, whether
| they're the NYT or a respected industry blog.
|
| We can't even be sure that most of these aren't changing old
| stories, unless we notice and check archive.org, and they
| haven't had them deleted from the archive. The NYT has
| blockchain verification, but the reason nobody else does is
| because no one else wants to. They want to be free to change
| old stories.
| crazygringo wrote:
| > _but the reason nobody else does is because no one else
| wants to. They want to be free to change old stories._
|
| You're wildly assuming a motive with zero evidence.
|
| No, the reason companies aren't building blockchain
| verification of their stories is simply because it's
| expensive and complicated to do, for literally zero
| commercial benefit.
|
| Archive.org already will prove any difference to you, and
| it's much easier to use/verify than any blockchain
| technology.
| strikelaserclaw wrote:
| Most people these days interact with news through comments; if
| the comments look legit, a lot of people assume the source is
| legit. Imagine a world in which a fake video has the BBC logo
| on it and AI-generated comments act as if they are discussing
| the video while subtly manipulating: say 60% of the comments
| advocate a certain viewpoint and 40% are random memes,
| advocate against it, etc. The average person would easily be
| fooled.
| oceanplexian wrote:
| You basically described Reddit. Don't even need an AI, all
| you need is moderator powers and a bunch of impressionable
| young people.
| randcraw wrote:
| With 90% of human generated media content being forgettable
| within weeks of publication, and AI not yet capable of matching
| even _average_ human content (much less pro level), it'll be
| some time before we have to worry about AI overwhelming most
| media content and erasing the works of memorable human authors.
| og_kalu wrote:
| >and AI not yet capable of matching even average human
| content (much less pro level)
|
| Yeah, this is not true. SOTA text and image generation is well
| above average human baselines. You can certainly generate
| professional-level art on Midjourney.
| squidsoup wrote:
| Commercial art and Art are not the same thing.
| seydor wrote:
| The past ended in 2022
| bottlepalm wrote:
| Agree. Any video/image/text created post-2022 is now suspect
| of being AI generated (even this comment). And without any
| 'registering' of pre-2022 content, we can easily lose track
| and not really know what from pre-2022 is authentic or not.
|
| Maybe it's not a big deal to 'lose' the past, maybe landfills
| will be mined for authentic content.
| shon wrote:
| This ^^
| apabepa wrote:
| Or is the past endlessly rehashed with AI generated content?
| jeffwass wrote:
| I've been wondering about this and real video evidence (eg
| dashcam or cctv) being refuted in court for inability to show
| it's not deepfaked.
| russdill wrote:
| If you're watching a movie or TV show, a vast majority of the
| sounds you are hearing are not "real". Has that bothered you
| before?
| swores wrote:
| That seems as pointless a question as suggesting that
| enjoying TV shows means you shouldn't care if everyone in
| your life constantly lies to you.
| Ylpertnodi wrote:
| > The stuff I hear is real. Perhaps you meant 'are not from
| the actual source you think they are'?
|
| *My favorite is always the nightclub scene that goes real
| quiet when the actors act using their voices (which are real
| but may be dubbed in afterwards).
| Culonavirus wrote:
| Here's a different question: Can you use the audio output this
| produces for anything else other than "research purposes"?
| [deleted]
| sangnoir wrote:
| You can - as long as it's not commercial. It's a broad
| definition, but a good rule of thumb is: you're fine if you're
| not directly making money from the generated audio. They may
| still come after you if you're making money indirectly, so
| consult a lawyer.
| blackkettle wrote:
| I can see some fantastic uses for this in generating complex
| acoustic environments to layer over TTS or real recordings
| for speech-to-text model training. I wonder if that is
| occupying some kind of gray-area. For example you have
| 1000hrs of clean speech from the librispeech corpus. It would
| be trivial to use this tool and available weights to generate
| background noise, environmental noise and the like, and then
| layer this with the clean speech to cheaply train a much more
| robust model. The environmental audio you create would never
| be directly shared or sold, but it would impact the overall
| quality of the STT model that you train from the combined
| results.
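The augmentation recipe described above boils down to mixing generated noise into clean speech at a chosen signal-to-noise ratio. Below is a minimal stdlib sketch of just that mixing step; the toy signals stand in for LibriSpeech audio and AudioGen output, which in practice would be waveform tensors at matched sample rates:

```python
import math
import random

def mix_at_snr(speech, noise, snr_db):
    # Scale the noise so that speech power / noise power equals the target
    # SNR (in dB), then add it sample by sample to the clean speech.
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]

# Toy stand-ins: one second at 16 kHz of a 440 Hz tone plus uniform noise.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(16000)]

noisy = mix_at_snr(speech, noise, snr_db=10.0)
```

Sweeping `snr_db` over a range (e.g. 0-20 dB) per training example is the usual way to get the robustness the comment describes.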
| westurner wrote:
| Generative AI > Modalities > [Music,]:
| https://en.wikipedia.org/wiki/Generative_artificial_intellig...
| dragonwriter wrote:
| The license of the model weights is CC-BY-NC, which is not an
| open source license.
|
| The code is MIT, though.
| archontes wrote:
| It's unlikely that model weights can be copyrighted, as they're
| the result of an automatic process.
| dragonwriter wrote:
| > It's unlikely that model weights can be copyrighted, as
| they're the result of an automatic process.
|
| _If_ they can't _for that reason alone_ , then the model is
| a mechanical copy of the training set, which may be subject
| to a (compilation) copyright, and a mechanical copy of a
| copyright-protected work _is_ still subject to the copyright
| of the thing of which it is a copy.
|
| OTOH, the choices made _beyond_ the training set and
| algorithm in any particular training may be sufficient
| creative input to make it a distinct work with its own
| copyright, or there may be some other basis for them not
| being copyright protected. But the mechanical-process argument
| alone just _moves_ the point of copyright on the outcome; it
| doesn't eliminate it.
| vasili111 wrote:
| Is there place where I can check how it works? Like give my input
| and get output audio?
| operator-name wrote:
| The model cards from the repo[0] link to Colab and HF spaces.
|
| [0]: https://github.com/facebookresearch/audiocraft#models
| emporas wrote:
| Audiocraft Plus (don't forget the plus) on GitHub has a Colab
| notebook based on AudioCraft, and a web UI to use. It is
| pretty awesome!
| smallerfish wrote:
| This is Spotify's route to profitability - the Netflix model of
| generating their own "content" (/music), and not having to pay
| the labels. Premium plans for us music nerds who want a human at
| the other end, regular plans for plebs who just want to fill the
| silence with something agreeable.
| CharlesW wrote:
| Although I think AI-generated and AI-augmented (using voice
| cloning, etc.) artists are a given, for Spotify to stop paying
| labels they'd have to be able to remove all non-Spotify content
| from their streaming catalog. That doesn't seem like a
| possibility in our lifetimes. (Also, Spotify hasn't even been
| able to build a sustainable business on podcasts, which they
| copy to their closed platform for free.)
|
| It's an interesting thought experiment, though. I can imagine
| that "environmental audio" companies like Muzak have about 5
| years left before they either adapt or die. What other kinds of
| companies are in trouble?
| smallerfish wrote:
| Their current pay structure is royalties, i.e. per listen. If
| they can route their audience to mostly AI generated content
| in time (say, 5-10 year transition), and it's just as good
| for most people, then they can negotiate much lower prices
| with the labels. We all grumble about Netflix being full of
| junk, but most of us are still subscribers, despite a sparse
| catalog of big name movies.
| hobofan wrote:
| > then they can negotiate much lower prices with the labels
|
| Or alternatively, if the labels are not stupid, they'll
| negotiate for a higher price per listen (or similar), as
| they are still as essential to the service as before.
| jeffbee wrote:
| The fact that it generates a song for the prompt "Earthy tones,
| environmentally conscious ... organic instrumentation" goes a
| long way to proving that English words no longer mean anything
| particularly.
| chrisjj wrote:
| That sort of presumes those words had any effect on the output.
|
| We might know more had it generated a song as you said, but in
| fact it generated only an instrumental.
| TrackerFF wrote:
| I've played guitar for 25 years, and it's funny how the music
| community has been using all kinds of words to describe music or
| tone. Describing certain tone as "hairy", "mushy", "wooly",
| "airy", "buttery", etc. is just very common.
| jeffbee wrote:
| Sure, there's jargon. But these words don't describe the
| music. They describe the kinds of people who would listen
| to it (according to the biases of the language model). As a
| description of the music, it's
| meaningless. If a person was asked to name some
| "environmentally conscious" music they could just as easily
| veer over to hardcore straight edge.
| camillomiller wrote:
| I think "song" should go in quotes
| bestcoder69 wrote:
| Anyone know if there are ways, as-is, to speed this up on Apple
| Silicon?
|
| This setup takes 5 minutes:
|   - Mac Studio M1 Max, 64GB memory
|   - running musicgen_app.py
|   - model: facebook/musicgen-medium
|   - duration: 10s
| mk_stjames wrote:
| I see this in musicgen.py:
|         if torch.cuda.device_count():
|             device = 'cuda'
|         else:
|             device = 'cpu'
|
| So PyTorch will fall back to CPU on Apple Silicon. Ideally it
| would use Metal acceleration (MPS) instead of plain CPU, but if
| you replace 'cpu' with 'mps' you'll probably run into a few bugs
| due to various autocast errors and, I think, some other
| incompatibilities with PyTorch 2.0.
|
| At least that is what I ran into last time I tried to speed
| this up on an M1. It's possible there are fixes.
| bestcoder69 wrote:
| Same here (mps errors). I tried after the initial musicgen
| release.
|
| I'll have to check again, but I remember AFAICT my hardware
| wasn't getting saturated, so maybe there's headroom for mac
| cpu performance. And of course in the meantime I'll be
| refreshing the ggml github every day
| CSSer wrote:
| Does anyone else hear a kind of background static in these
| samples? It almost sounds like part of the track is more
| compressed in terms of dynamic range than other parts, which
| doesn't make any sense to me. I'm trying to decide if this is my
| own confirmation bias at work or not.
| aimor wrote:
| This is great, I've been wanting sound effect generation for
| years. I spent a lot of time trying to get WaveNet working well,
| eventually just dropped the project after mediocre results. With
| AudioGen I'm generating a sample in less than a second.
| trojan13 wrote:
| Finally, a way to fulfill my childhood dream of composing a
| symphony of rubber ducks honking. Bach would be proud.
|
| /edit On a more serious note. I already see the 24/7 lofi girl
| streaming generated music. The sample[1] on lofi sounds pretty
| good.
|
| [1]https://dl.fbaipublicfiles.com/audiocraft/webpage/public/ass..
| . "Lofi slow bpm electro chill with organic samples"
| squidsoup wrote:
| > Finally, a way to fulfill my childhood dream of composing a
| symphony of rubber ducks honking.
|
| Samplers have been around since the 70s.
| painted-now wrote:
| I also like some of the generated examples.
|
| Can I haz full version of Bach + `An energetic hip-hop music
| piece, with synth sounds and strong bass. There is a rhythmic
| hi-hat pattern in the drums.` please?
|
| (https://dl.fbaipublicfiles.com/audiocraft/webpage/public/ass..
| .) ?
| bulbosaur123 wrote:
| Oh my god, some of these tracks actually SLAP.
|
| Like for real.
|
| The last bastion of human creativity is about to be defeated.
| IAmGraydon wrote:
| Which ones slap? I want them to, but what I'm hearing is only
| OK. I think this could generate some interesting starting
| points for me when I'm stuck, though.
| momirlan wrote:
| oh no, more muzak !
| JHonaker wrote:
| Get ready for the next generation of Muzak
| [deleted]
| camillomiller wrote:
| All very interesting, but how would a musician ever be interested
| in creating the result of "Pop dance track with catchy melodies,
| tropical percussions, and upbeat rhythms, perfect for the beach"?
| This stuff will create a lot of Muzak for sure. Will it actually
| turn into anything useful for musicians? I honestly doubt it,
| and I'm happy if it stays that way.
|
| Saying that engineers don't understand the arts is a bit of a
| trite generalization, but reading the way Meta markets these
| "music making" contraptions is really cringe-inducing. Have you
| ever, at least, listened to some music?
| RyanAdamas wrote:
| For my two cents; the goal of our human pursuit is to advance our
| technologies and systems to the point we can all live carefree
| lives where the focus of our pursuits is self defined.
|
| It is not to "own more rights" to shit. Sorry, but no one
| actually owns what they create, that's the point of creation. And
| if you take issue with the term create, that only reinforces my
| point. We're all influence machines, input and output; the
| future should not be about preserving some people's rights to
| limit our collective advancements over their personal wants.
| Tough shit.
| wpietri wrote:
| I might point you to Article I, Section 8, Clause 8 of the US
| Constitution: "[The Congress shall have Power . . . ] To
| promote the Progress of Science and useful Arts, by securing
| for limited Times to Authors and Inventors the exclusive Right
| to their respective Writings and Discoveries."
|
| They are with you on the advance, and that in the long term
| science and the useful arts can't be owned. But to achieve that
| long-term goal, they saw it as valuable to give people
| temporary rights to align those "personal wants" with "our
| collective advancements".
| lannisterstark wrote:
| Ah yes, you should be free to rewrite the fictional work I
| wrote or add one chapter to it and be free to sell it under
| your name magically implying that you are the author. Screw the
| original artists, right? Why should they deserve anything.
|
| Thankfully your opinion is an extreme opinion and will never
| come to pass. Tough shit indeed. :)
|
| ----
|
| I really like hackernews but recently I've been seeing a
| plethora of "your rights don't matter, you own nothing" bs
| spreading around.
| danielheath wrote:
| I mean... rejecting ownership of information (but not
| rejecting attribution of work) was a key value of the hacker
| movement in the 80s, so I'm not surprised it's a popular
| belief on HN.
| [deleted]
| samstave wrote:
| I disagree - the human pursuit is artificially, or organically,
| besting its own pattern recognition wet-ware.
|
| If we can offload the mundane (survival) aspect of our pattern
| recognition engine, then maybe we can use those cycles on lofty
| pursuits - this is the Victorian fallacy.
|
| -
|
| Break everything down - it's all patterns all the way, and how
| we process them... we are letting AI take on an aspect of
| ourself (pattern recog ;; tokens;; and prediction)
|
| That, if applied to self-preservation, is the essence of
| sentience.
|
| (I think! therefore, I am, and I will prevent you from making
| me NOT)
| [deleted]
| ChatGTP wrote:
| This has nothing to do with living a carefree life, this whole
| AI initiative is so tech companies can extract more money from
| their products.
|
| Don't want to pay for content ? Well we have "solved that"...
| whycome wrote:
| Culture is based on the creation of those that came before.
| Genres of music are created as people try to mimic styles of
| those before...one could argue that they "trained themselves on
| the previous dataset". No one creates anything in a vacuum.
| They utilize things that we collectively have contributed.
| Hell, language and writing is the open source thing that we
| collectively own that ppl use to create their stuff. The
| creation came after going to schools that we collectively pay
| for. And travelling on shared roads. It's the standing on the
| shoulders of giants -- except its really just stacked people
| all adding their bits.
| emporas wrote:
| Also the latin alphabet originated from the Euboean alphabet.
| Euboia is my home, I live here. My guesstimate is that all
| latin writers, wouldn't like to pay copyright for their use
| of our Greek letters in their everyday lives.[1]
|
| I mean, everyone who writes right now, in this HN thread,
| owes copyright to someone for the letters, amirite? That
| someone is me. Anyway, long story short, copyright was always
| a pretty ridiculous idea, alongside patents of course, but it
| is only right now, with programs that can mimic writing
| style, painting style, speech style, etc., that this is
| obvious to everyone.
|
| As a side note, there was a Greek private torrent tracker,
| blue-whitegt, which today would be a serious competitor to
| American companies like netflix or youtube, but it was shut
| down because, surprise surprise, there were some copyright
| issues, despite the site being a really quality service, a
| paid service of course. Blue-whitegt today would be a 10
| billion to 100 billion company, instead all the profits
| aggregated to American companies.
|
| When it comes to copyright, very soon everyone on the planet
| will have monetizable torrent seeding. All of these copyright
| chickens are coming home to roost!
|
| https://en.wikipedia.org/wiki/Archaic_Greek_alphabets#Euboea.
| ..
| lm28469 wrote:
| > For my two cents; the goal of our human pursuit is to advance
| our technologies and systems to the point we can all live
| carefree lives where the focus of our pursuits is self defined.
|
| Our overlords didn't get the memo
| munificent wrote:
| _> the goal of our human pursuit is to advance our technologies
| and systems to the point we can all live carefree lives where
| the focus of our pursuits is self defined._
|
| Agreed!
|
| _> the future should not be about preserving some people's
| rights to limit our collective advancements over their
| personal wants._
|
| Systems that don't allow people to extract value from the hard
| work they put into collective advancement do not seem to lead
| to collective advancement over time and at scale. Incentives
| matter. No one's going to spend all day making candy and put it
| in the "free candy" bowl when that one asshole kid down the
| street just takes all of the candy out of the bowl every single
| day.
|
| At small scales (i.e. relatively few participants with a
| relatively high number of interactions between them) then
| informal systems of reciprocity and reputation are sufficient
| to disincentivize bad actors.
|
| At large scales where many interactions are one-off or
| anonymous, you need other incentives for good-faith
| participation. There's a reason you don't need a bouncer when
| you have a few friends over for drinks, but you do if you open
| a bar.
| cwkoss wrote:
| The idea of intellectual property being property that someone
| owns is a purely social construct though.
|
| Candy can be consumed to depletion. Art gets richer the more
| it is consumed.
|
| A much better analogy would be having a sculpture in your
| front yard. The idea that a kid would be an asshole for
| appreciating the sculpture too much is obviously laughable.
| People choose to decorate their yards for the status having an
| attractive yard brings, without the expectation of profit
| from it.
| thfuran wrote:
| >The idea of intellectual property being property that
| someone owns is a purely social construct though
|
| So is the idea of real property being something that
| someone owns.
| oceanplexian wrote:
| > No one's going to spend all day making candy and put it in
| the "free candy" bowl when that one asshole kid down the
| street just takes all of the candy out of the bowl every
| single day.
|
| Software is an infinite candy bowl. Taking candy out of the
| bowl does not take any candy from the person who made it.
|
| Imagine if this was the physical world, and you had a machine
| that could end world hunger. You could copy food like you
| could copy and paste information on a computer. Imagine
| someone who would keep that machine to themself, out of a
| sense of entitlement to make a few bucks. Any person with any
| sense of morality can see the obvious problem with that.
| wpietri wrote:
| > Software is an infinite candy bowl.
|
| More in theory than in practice. Ask any open-source
| maintainer how much running a popular project is unlike
| putting out an infinite candy bowl and then going on with
| your life.
| munificent wrote:
| _> Software is an infinite candy bowl. Taking candy out of
| the bowl does not take any candy from the person who made
| it._
|
| Software is not a finite candy bowl. It is also not an
| infinite candy bowl. It's not like physical goods at all,
| not even like physical goods that can be magically cloned.
| It's just different, entirely.
|
| The incentive and value structures around data creation and
| use just can't be directly mapped to physical goods. You
| have to look at them as they actually are and understand
| them directly, not by way of analogies.
|
| Why do people make software and give it out for free? Is it
| purely from the joy of creation? Sure, that's part of it.
| The desire to make the world better? Probably some of that
| too. Are those forces _enough_ to explain all open source
| contribution?
|
| Definitely not. Here's one quick way to tell: Ask how many
| open source maintainers would be happy if someone else were
| to clone their open source project, rename it, claim that
| they had invented it, and have that clone completely
| overshadow and eradicate their original creation?
|
| If the goal was purely altruistic, the original creator
| wouldn't mind. More candy in the infinite candy bowl,
| right?
|
| But, in practice, many open source maintainers strongly
| oppose that. There is a strong culture of _attribution_ in
| open source, largely because there _is_ a compensation
| scheme built into creating free software: _prestige_. One
| of the main incentives that encourages maintainers to slave
| away day after day is the social cachet of being known as
| the cool person who made this popular thing.
|
| _> Imagine if this was the physical world, and you had a
| machine that could end world hunger. You could copy food
| like you could copy and paste information on a computer._
|
| Analogies are generally bad tools for real understanding,
| but let's go with this. Let's say this machine took fifty
| years of someone's life to invent, toiling away in
| obscurity. Basically, an entire working career spent only
| on this invention with nothing else to show for their adult
| life.
|
| If, at the end, _no one would ever know it was you who
| invented it_, how many people would be willing to
| sequester themselves in that dark laboratory and make that
| sacrifice?
| naillo wrote:
| > Systems that don't allow people to extract value from the
| hard work they put into collective advancement [...] one's
| going to spend all day making candy and put it in the "free
| candy" bowl
|
| On the other hand this is what researchers do all day every
| day. PhDs and professors work for the common good and get
| barely any pay in return. Maybe the future model in art and
| music is more like the academic researcher.
| munificent wrote:
| PhDs and professors are paid a living wage (though less so
| over time as federal funding for higher institutions has
| dwindled).
|
| Academia is a carefully constructed system whose incentive
| structure is based on highly visible explicitly measured
| citations and reputation.
|
| People aren't generally _just_ trying to maximize wealth.
| They're trying to maximize their sense of personal value,
| which tends to be a combination of wealth, autonomy, and
| social prestige. Academics (and some creative fields) tend
| to be biased towards those who prioritize prestige over
| wealth.
| cpill wrote:
| yeah, but they also get to work on what they love as
| opposed to what ever the corporate interest currently is.
| it's rare you get paid well for doing what you love, i.e.
| music, teaching, designing handbags, etc.
| karencarits wrote:
| Well, researchers usually have to get their own grants
| and must thus work on whatever various funding sources
| deem worthy. Further, academic positions typically have
| duties that researchers may not like - administration,
| reporting, teaching, etc
| chefandy wrote:
| No it's not-- researchers generally get paychecks. Even if
| they're small, they can pay for their housing and buy their
| kid food.
|
| Artists don't see a single red cent from their work being
| sucked up into some AI content blender. Their work is being
| taken and used-- often in service of others making a
| profit-- and they receive _nothing._ Not even credit.
|
| Edit: Well, they don't receive _nothing_ -- they get a
| bunch of people telling them they're selfish jerks for
| wanting to support themselves with their work.
| cwkoss wrote:
| The majority of artists never receive a single red cent from
| the humans who consume their work.
|
| This is how it has always been, and fundamental to the
| economics of art. Things people are willing to do
| regardless of financial compensation rarely pay well.
| chefandy wrote:
| Putting commercial artists, aspiring fine artists, and
| hobby artists in the same bin doesn't make sense. There
| are a ton of career commercial artists that make money
| solely off of their work. If you think there are more
| aspiring career fine artists that don't end up making it
| than career commercial artists, you're wrong. They're not
| even in the same business.
| pschuegr wrote:
| "Maybe the future model in business and sales is more like
| the academic researcher" funny how nobody ever suggests
| that.
| zztop44 wrote:
| To the contrary, the future of the academic researcher is
| business and sales.
| gmd63 wrote:
| Most things people do end up being in the care of other people.
|
| If nobody is beholden to any job or duty, and the machines do
| everything, who is to say I don't want to make every machine on
| earth dance in a flash mob? I cannot do that, because it would
| require other people to halt their use of the machines.
| Abundance is a false promise and one we should be quick to
| shoot down lest we surrender our future rights to the ones
| advertising it.
|
| Removing the worth of people in their jobs removes their
| leverage in the constant resource allocation negotiation in the
| economy. Given that we just witnessed Elon Musk spend 20,000
| average American lifetime earnings worth of wages just to be
| the new dictator of a social media company, I'm not sure that I
| want those negotiations to take place only among the giga-rich.
| primitivesuave wrote:
| Creating something (writing a book, recording a song, etc) is a
| conversion of time (your only finite resource) into something
| of value (maybe only to you). It also turns out that having a
| profit motive and IP protection around creating valuable things
| is a fundamental requirement for having a creative industry to
| begin with. It's also what drives the free market to determine
| which creations are even valuable to begin with.
| parekhnish wrote:
| Generative AI for images and music produce pixels and waveform
| data, respectively. I wonder if there is research into
| "procedural" data; so in this case, it would be SVG elements and,
| perhaps, MIDI data respectively.
|
| I know training data would be much harder to get,
| (notwithstanding legal ramifications), but I think that creating
| structured, procedural data will be much more interesting than
| just the final, "raw" output!
| IAmGraydon wrote:
| I've thought about this too. The instruments themselves can be
| synthesized for extremely high quality audio. All we need is
| the musical structure - the MIDI.
| Palmik wrote:
| Maybe this will finally lead to high-quality open-weights
| solution for TTS generation.
| [deleted]
| RobotToaster wrote:
| CC-BY-NC isn't an open source licence; it violates point six of
| the open source definition https://opensource.org/osd/
| mesebrec wrote:
| Where does it say this is CC-BY-NC?
|
| The article says this:
|
| > Our audio research framework and training code is released
| under the MIT license to enable the broader community to
| reproduce and build on top of our work
| btown wrote:
| It's pretty common in academic research for trained model
| weights to be licensed under something different from the
| code that one would run to create such a model if one had
| _both_ sufficient compute resources and the same training
| dataset. That is, if those weights are ever released at all!
|
| IMO, while I'd rather have one part permissively licensed
| than nothing at all... it stinks that companies sponsoring
| researchers get an un-nuanced level of street cred for "open
| sourcing" something that they know nobody will _ever_ be able
| to reproduce because their data set and /or their compute
| grid's optimizations are proprietary.
|
| As it stands, I'm not at all sure that the outputs of this
| model can be used for commercial videos.
| gnaman wrote:
| https://github.com/facebookresearch/audiocraft/blob/main/LIC.
| ..
| hackernewds wrote:
| who gets to declare what is the "open source definition" and
| why?
| BearhatBeer wrote:
| [dead]
| frognumber wrote:
| In my opinion, the Free Software Foundation, ironically,
| since they invented the movement, with open source starting
| out as a tacky rip-off with the ethics stripped out. After
| decades, open source converged on free software.
|
| More popular opinion is OSI:
| https://en.wikipedia.org/wiki/Open_Source_Initiative
|
| They were founded by the persons who (claimed to have)
| invented the term in order to steward it. It's the same
| definition as the FSF.
| xdennis wrote:
| The people who created the term: the Open Source Initiative.
|
| Before, people most often used "free software" as defined by
| the free software movement, but some disliked this term
| because it's confusing (most think "free" means no money) and
| perceived to be anti-commercial.
|
| The term "open source software" was chosen and given a
| precise definition.
|
| It's dishonest, then, for people to use the term "open source
| software" with a different interpretation when it was
| specifically chosen to avoid confusion.
| barbariangrunge wrote:
| Companies just putting "open" in the names of non-open things
| to make hn and the press automatically love it
| reducesuffering wrote:
| "Now, increasingly, we live in a world where more and more of
| these cultural artifacts will be coming from an alien
| intelligence. Very quickly we might reach a point when most of
| the stories, images, songs, TV shows, whatever are created by an
| alien intelligence.
|
| And if we now find ourselves inside this kind of world of
| illusions created by an alien intelligence that we don't
| understand, but it understands us, this is a kind of spiritual
| enslavement that we won't be able to break out of because it
| understands us. It understands how to manipulate us, but we don't
| understand what is behind this screen of stories and images and
| songs."
|
| -Yuval Noah Harari
| ironborn123 wrote:
| Maybe this us vs them mentality is the biggest bottleneck.
|
| If instead you consider that this new form of 'alien'
| intelligence is actually a descendant of human intelligence,
| that we are raising a new species which will inherit what
| humans have built (ideally only the good parts) and then
| improve upon it further..
|
| It may sound grandiose, but that perspective changes
| everything.
| pcwelder wrote:
| Diffusion models are now SOTA in audio and image generation.
| Has anyone given them a shot on text?
|
| Audio is more similar to language than images are, because of
| its stronger time dependency.
|
| The paper says the critical step they took for making diffusion
| model work for audio was splitting the frequency bands and
| applying diffusion separately to the bands (because full band
| model had limitations due to poor modeling of correlations
| between low frequency and high frequency features).
|
| I think something could be done on text side as well.
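| As a loose illustration of the band-splitting idea (a toy
| moving-average filter of my own, not the actual filter bank from
| the paper): split a signal into complementary bands whose sum
| reconstructs the original, so each band can be modelled
| separately and then recombined.

```python
def split_bands(x, k=9):
    """Toy two-band split: a length-k moving-average lowpass gives
    the low band; the residual is the high band. The two bands sum
    back to the original signal sample-for-sample."""
    half = k // 2
    low = []
    for i in range(len(x)):
        window = x[max(0, i - half): i + half + 1]
        low.append(sum(window) / len(window))
    high = [v - l for v, l in zip(x, low)]
    return low, high

signal = [float(i % 7) for i in range(50)]
low, high = split_bands(signal)
# Each band could be processed (or diffused) independently, then
# the results summed to reconstruct a full-band signal.
assert all(abs((l + h) - s) < 1e-9 for l, h, s in zip(low, high, signal))
```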
| polygamous_bat wrote:
| There are two problems with this. Diffusion models work on a
| single rule of thumb: if you keep adding small, noisy Gaussian
| steps to a "nice" distribution many times, you get a standard
| Gaussian at the end.
|
| So, for text: a) what is the equivalent of a small, noisy step?
| and b) what is the equivalent of a standard Gaussian in
| language space?
|
| If you can solve a and b, you can make diffusion work for text,
| but there hasn't been any significant progress there afaik.
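| For intuition on why continuous signals are the easy case, here
| is a toy forward-diffusion pass in plain Python (my own
| illustration, with an assumed constant noise schedule).
| Repeatedly mixing in small Gaussian steps drives any starting
| signal toward a standard Gaussian; discrete text tokens have no
| obvious analogue of that small step.

```python
import random

def diffuse(x0, beta=0.02, steps=1000, rng=None):
    """Toy forward diffusion over a list of samples:
    x_t = sqrt(1 - beta) * x_{t-1} + sqrt(beta) * noise."""
    rng = rng or random.Random(0)
    keep, mix = (1 - beta) ** 0.5, beta ** 0.5
    x = list(x0)
    for _ in range(steps):
        x = [keep * v + mix * rng.gauss(0.0, 1.0) for v in x]
    return x

# Start far from N(0, 1); after enough steps the original signal is
# gone and the samples look like draws from a standard Gaussian.
out = diffuse([5.0] * 2000)
mean = sum(out) / len(out)
var = sum((v - mean) ** 2 for v in out) / len(out)
```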
| wdb wrote:
| I am curious if it can generate audio for a specific country.
| For example, the siren sound in the sample doesn't sound like
| a siren I would recognise. Sounds like an American one?
| samstave wrote:
| That's an interesting question:
|
| What about the ring tone, busy tone, or disconnected tone for
| any country over time? 2600 vibes (pun)
| aimor wrote:
| I tried it out (american, british, korean, italian, japanese)
| and couldn't really get any control. Sometimes the american
| siren would sound different, but asking for a siren of a
| specific country would just give the american sound. Maybe
| better prompting would help. I used "isolated american
| ambulance siren no traffic".
| jasonjmcghee wrote:
| The difference between MBD-EnCodec and EnCodec is pretty
| interesting. The MBD variant sounds more like a professional
| studio recording, while EnCodec has a richer sound.
|
| Curious if I'm alone in that.
|
| (At the bottom https://audiocraft.metademolab.com/musicgen.html)
|
| For what it's worth though, the voice based examples sound
| dramatically better with MBD
|
| https://audiocraft.metademolab.com/encodec.html
| operator-name wrote:
| MBD definitely sounds like it was recorded in a dead room,
| whereas plain EnCodec sounds mixed but includes some
| artificial noise.
| makestuff wrote:
| These models are going to end up being used for advertising. Soon
| pretty much every ad you see will be generative AI based. It
| makes A/B testing way easier as you no longer need a creative
| person to modify the ad or change something subtle about it. For
| example, the generative voice might change to a different speaker
| or something, and the AI can generate thousands of different
| voices to see which one is most effective.
| [deleted]
| skybrian wrote:
| As an amateur musician I'm wondering if there are any of these
| audio generators that you can give a tune or chord progression to
| riff on. ABC format maybe? There are lots of folk tunes on
| thesession.org.
|
| Could you generate a rhythm track? Ideally you could make songs
| one track at a time, by giving it a mix of the previous tracks
| and asking it to make another track for an instrument. Or, give
| it a track and ask it to do some kind of effect on it.
|
| Another interesting use might be generating sound samples for a
| sampled instrument.
| emporas wrote:
| If you mean to give a source of melody of 30sec and extend that
| melody into a full song, yes MusicGen can do that. There are
| two ways to extend a song based on a melody: 1) give a sample,
| and continue the song from that sample as close as possible,
| and 2) give a melody as an inspiration.
|
| They both work with varying degrees of success. A lot of
| questions are answered in the issues and discussions sections
| of Audiocraft on GitHub.
| chrisjj wrote:
| Evidence? None of the demos suggest that is true.
| emporas wrote:
| Did they change the base model? If not: audiocraft_plus,
| which is based on audiocraft, creates music close to 5
| minutes in length.
|
| I don't know if audiocraft_plus incorporates all three
| modalities of the release, MusicGen, AudioGen, and EnCodec.
| It uses MusicGen for sure, all four models, small, medium,
| large and melody.
|
| https://github.com/GrandaddyShmax/audiocraft_plus
| chrisjj wrote:
| But is that "extension" to 5 mins more than just a repeat
| of the e.g. 15 secs heard in the demos?
| o_____________o wrote:
| Skip to the samples:
|
| https://audiocraft.metademolab.com/audiogen.html
| operator-name wrote:
| And https://audiocraft.metademolab.com/musicgen.html
|
| The samples included in the press release are quite impressive
| to my ears, but the other samples (especially from AudioGen)
| have a hint of artificiality.
|
| As usual the music is quite repetitive, but I'm looking forward
| to tools that simplify changing the prompt whilst it generates
| over a window. I can only imagine the consequences for royalty
| free music.
|
| Edit: the "Text-to-music generation with diffusion-based
| EnCodec" samples are quite impressive.
| joshstrange wrote:
| I'm looking forward to playing with the M1 Mac apps/cli-tools
| that will probably come out for this in the next week or so!
| Being able to run this stuff locally is a lot of fun.
| illwrks wrote:
| Are the M1 Macs capable enough? I'm eyeing an upgrade in the
| coming months and I'm curious if a MacBook would be suitable.
| joshstrange wrote:
| I've run Stable Diffusion locally (both from the cli and
| later using GUI wrappers) and that used my GPUs, I've also
| run Llama locally but I believe that was on the CPU (I used
| both llama.cpp, cli, and Ollama, gui). So to sum it up: yes?
| Or at least it's good enough for me.
___________________________________________________________________
(page generated 2023-08-02 23:00 UTC)