[HN Gopher] Stable Audio Open
___________________________________________________________________
Stable Audio Open
Author : davidbarker
Score : 145 points
Date : 2024-06-05 17:19 UTC (5 hours ago)
(HTM) web link (stability.ai)
(TXT) w3m dump (stability.ai)
| minimaxir wrote:
| Note that this has the typical noncommercial "you have to pay for
| a membership to use commercially" Stability license.
| simonw wrote:
| Sigh: Stable Audio Open is an open source
| text-to-audio model [...]
|
| License: https://huggingface.co/stabilityai/stable-audio-
| open-1.0/blo... STABILITY AI NON-COMMERCIAL
| RESEARCH COMMUNITY LICENSE AGREEMENT
|
| Stability are one of the worst offenders for abusing the term
| "open source" at the moment.
| littlestymaar wrote:
| What a shame for something that could have been completely
| free from copyright issues given the training source... (And
| there's still no legal ground for such a license claim, since
| training hardly qualifies as a creative process on their side)
| elpocko wrote:
| Every time I call out the absurd interpretation of "Open
| Source" in this space in general, I get showered with
| downvotes and hateful attacks. One time someone posted their
| "AI mashup" project on reddit, that egregiously violated the
| terms of not only one, but several GPL-licensed projects.
| Calling this out earned me a lot of downvotes and replies
| with absolutely insane justifications from people with no
| clue.
|
| No one cares. Not in this space.
| jeroenhd wrote:
| AI fans don't seem to care much for copyright, unless it's
| their work being stolen (remember the people that got mad
| at "prompt stealing"?).
|
| Companies are more risk-averse, though, and hobbyists on
| Reddit don't have the money to do anything serious with
| this software.
| elpocko wrote:
| Nope. Most of the relevant software is OSS made by
| hobbyists. ComfyUI, llama.cpp, etc. For example, Nvidia
| is building stuff based on ComfyUI, a GPL-licensed
| application.
|
| My complaint is about people ignoring the license of open
| source software made by hobbyists. I disagree with your
| ignorant "AI fans" generalization.
| immibis wrote:
| To be fair, open source is far too corporate-friendly at the
| moment. It _should_ be more non-commercial. To what extent is
| an open question.
| jeroenhd wrote:
| Open Source, as used by the most influential open source
| projects, is corporate-friendly by definition.
|
| There are good reasons to use something more aggressive.
| I'm a big fan of the strict copyleft licenses for this,
| even if that means companies like Google don't want to use that
| software anymore.
| hehdhdjehehegwv wrote:
| Highly commendable:
|
| "The new model was trained on audio data from FreeSound and the
| Free Music Archive. This allowed us to create an open audio model
| while respecting creator rights."
|
| This should be standard: commons go in, commons go out.
| mastermedo wrote:
| Except here the out is not commons if I understand correctly.
|
| EDIT: might be CC non-commercial
| samfriedman wrote:
| Free idea because I'm never going to get around to building it:
| An "AI 8 track" app; click record and hum a melody, then add a
| prompt and click generate. The model converts your input to an
| instrument matching your prompt, keeping the same notes/rhythm
| you hummed in the original. Record up to 8 tracks and do some
| simple mixing.
|
| Would be a truly amazing thing for sketching songs! All you need
| is decent humming/singing/whistling pitch. Hum and generate a
| bass line, guitar lead, strings, etc. And then sing over it -
| it would let solo musicians sketch out a song far more easily
| than transcribing the melody to a piano roll.
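|
| A minimal sketch of just the melody-capture step, in Python
| (assuming librosa and pretty_midi are installed; the
| prompt-conditioned re-synthesis is the part that would need a
| generative model and is not shown here):
|
|     # Extract the hummed pitch and write it out as MIDI notes.
|     import librosa
|     import numpy as np
|     import pretty_midi
|
|     def hum_to_midi(path, out_path="sketch.mid"):
|         y, sr = librosa.load(path, sr=None, mono=True)
|         # pYIN gives a per-frame fundamental frequency estimate.
|         f0, voiced, _ = librosa.pyin(
|             y, fmin=librosa.note_to_hz("C2"),
|             fmax=librosa.note_to_hz("C6"), sr=sr)
|         hop = 512  # librosa's default hop length
|         times = librosa.times_like(f0, sr=sr, hop_length=hop)
|         midi = pretty_midi.PrettyMIDI()
|         inst = pretty_midi.Instrument(program=0)
|         # One short note per voiced frame; a real app would merge
|         # consecutive frames of the same pitch into longer notes.
|         for t, hz, v in zip(times, f0, voiced):
|             if v and not np.isnan(hz):
|                 pitch = int(round(librosa.hz_to_midi(hz)))
|                 inst.notes.append(pretty_midi.Note(
|                     velocity=90, pitch=pitch, start=t, end=t + hop / sr))
|         midi.instruments.append(inst)
|         midi.write(out_path)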
| michaelbrave wrote:
| The tech is technically there using 2-5 different AI solutions;
| it mostly lacks an interface that automatically carries one step
| to the next.
| xnx wrote:
| Google MusicLM (and probably lots of other tools) do this:
| "MusicLM .. can transform whistled and hummed melodies
| according to the style described in a text caption."
|
| https://google-research.github.io/seanet/musiclm/examples/
| samspenc wrote:
| Sounds like Suno AI will soon have this feature as well
| https://x.com/suno_ai_/status/1794408506407428215
| ortusdux wrote:
| App name - Beat-it
|
| https://www.youtube.com/watch?v=eZeYw1bm53Y
|
| https://www.nme.com/blogs/nme-blogs/the-incredible-way-micha...
| eman2d wrote:
| A killer ad would be converting each vocal track back into
| the original song
| beoberha wrote:
| Facebook's MusicGen can pretty much do this
| bufferoverflow wrote:
| It produces decent audio, but there's something unpleasant about
| its high frequencies. And no voices; it doesn't seem to talk or sing.
|
| Udio, so far, is undefeated.
|
| And ElevenLabs' music demos were very very impressive, but it's
| still not released.
| genericacct wrote:
| Have you tried Suno? It is quite good, at least for some genres.
| bufferoverflow wrote:
| Suno is good at generating music, but its voices sound
| metallic with a dash of high-frequency noise. Which ruins it
| for me. It's almost there though, I think they will fix it in
| the next version.
| hierophantical wrote:
| None of these are impressive in the least. Anything I have
| heard from Udio is basically trash. It is the AI art equivalent
| of synthetic cats and pretty face shots. Who cares.
|
| What is ultimately going to be undefeated is training your own
| model.
| bufferoverflow wrote:
| I've heard many good things from Udio and the demo tracks
| from ElevenLabs are very high quality.
|
| https://www.udio.com/songs/ai2uAaBffRGdWdTNNqAbDx
|
| https://www.udio.com/songs/19xQAMG6E1UXG7wNvP7nDW
|
| https://www.udio.com/songs/mPAFYyFgo7Nqjb8ypeFfh9
|
| https://www.udio.com/songs/7sKM9jMwZrXwTTzmMYN9qv
|
| https://www.udio.com/songs/coixNX1gnJ1oWT8z2LQddk
| Uehreka wrote:
| > The new model was trained on audio data from FreeSound and the
| Free Music Archive. This allowed us to create an open audio model
| while respecting creator rights.
|
| This feels like the "Ethereum merge moment" for AI art. Now that
| there exists a prominent example with the big ethical obstacle
| (Proof of Work in the case of Ethereum, nonconsensual data-
| gathering in the case of generative AI) removed, we can actually
| have interesting conversations about the ethics of these things.
|
| In the past I've pushed back on people who made the argument that
| "generative AI intrinsically requires theft from artists", but
| the terrible quality of models trained on public domain data made
| it difficult to make that argument in earnest, even if I knew I
| was right in the abstract.
| rfoo wrote:
| Why is Proof of Work less ethical than Proof of Rich, a.k.a. the
| rich gradually getting richer without doing anything?
|
| Not saying PoW is safer (it's not), but "less ethical" is a
| pretty bold claim.
| rpicard wrote:
| Environmental impact of the proof of work algorithms is my
| understanding.
| toenail wrote:
| Proof of work mining is probably the only industry on the
| planet that has the potential to be carbon negative AND
| profitable.
| foota wrote:
| Renewable energy?
| toenail wrote:
| Waste energy like unused methane, flare gas and the like.
| Uehreka wrote:
| (sigh)
|
| Is this the methane flaring argument, or Peter Thiel's
| "windmills in Vermont"?
| toenail wrote:
| (sigh)
|
| Do you expect a reply when you start like this?
| Uehreka wrote:
| I don't really care either way. I'm tired of having to
| debunk the same sloppy arguments year after year.
| Uehreka wrote:
| The environmental impact. And yes, I know, 0.5%, but my issue
| was always that if PoW currencies went from being a niche
| subculture to a point where it was used for everyday exchange
| (many people were arguing that this would and should happen)
| that 0.5% would surely go up by a great deal. To a point where
| crypto would have to clear a super high bar of usefulness to
| counterbalance the harm it would do.
|
| To be fair, AI training also has a big carbon footprint, but
| I feel like the utility provided by AI makes it easier to
| argue that its usefulness counterbalances its ecological
| harm.
| avarun wrote:
| There is no "environmental impact". Environmental impact
| comes from energy production, not energy usage. It's
| incoherent to argue others should tamp down their energy
| usage because most folks producing energy aren't doing it
| in an ethical way.
| ben_w wrote:
| Ultimately any proof-of-work system has to burn joules
| rather than clock cycles (because any race on cycles-per-
| joule is rapidly caught up), and that makes it clearer
| where the waste is: to be economically stable, in the
| face of adversarial actions by other nation states (who
| sometimes have a vested interest in undermining your
| currency, and so actively seek the chaos and loss of trust in
| a double-spend event), your currency has to be backed by
| more electricity than any hostile power can spend on
| breaking it.
| skybrian wrote:
| It seems you've come up with a proof that there's no such
| thing as wasting electricity. When you prove an
| extraordinary claim like that, it's time to go back and
| figure out how you got it wrong.
| lolinder wrote:
| > It's incoherent to argue others should tamp down
| their energy usage because most folks producing energy
| aren't doing it in an ethical way.
|
| There's a general consensus that paying someone else to
| do your dirty work doesn't free you of the moral (or,
| usually, legal) culpability for the damage done. If you
| knowingly direct your money towards unethical providers,
| you are directly increasing the demand for unethical
| behavior.
|
| (That's assuming that the producers themselves are
| responsible for the ethics. If a producer is doing its
| best to convert to clean energy as fast as possible, they
| may be entirely in the clear but POW would _still_ be
| unethical. In that scenario POW is placing strain on the
| limited clean energy supplies, forcing the producer to
| use more fossil fuels than they'd otherwise need to.)
| jncfhnb wrote:
| Officer I merely stabbed the man. What he died from was
| blood loss.
| Sephr wrote:
| This is greenwashing. You're still positively valuing the
| past harms from proof of work.
| pa7x1 wrote:
| How do the rich become gradually richer under PoS? I'm
| flabbergasted by the level of math education.
|
| Assume we have 2 validators in the network; the first one
| owns 90% of the network, the second one owns 10%. Let's call
| them Whale and Shrimpy, respectively.
|
| To make the numbers round let's assume total circulating
| supply of ETH is 100 initially and that the yield resulting
| from being a validator is 10% per year. After the first year,
| 10 new ETH will have been minted. Whale would have gotten 9
| ETH, and Shrimpy would have gotten 1 ETH. OP is assuming that
| as 9 is bigger than 1, Whale is getting richer faster than
| Shrimpy. But, let's look at the final situation globally.
|
| At year 0:
|
| Total ETH circulating supply: 100 ETH
|
| Whale has 90 ETH. Owns 90% of the network.
|
| Shrimpy has 10 ETH. Owns 10% of the network.
|
| At year 1:
|
| Total ETH circulating supply: 110 ETH
|
| Whale has 99 ETH. Owns 90% of the network.
|
| Shrimpy has 11 ETH. Owns 10% of the network.
|
| Whale has exactly the same network ownership after validating
| for 1 whole year; the network is not centralizing at all! The
| rich are not getting richer any faster than the poor.
|
| TL;DR: Friends don't let friends skip elementary math
| classes.
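|
| A quick numeric sanity check of the same arithmetic (a toy
| Python sketch; the numbers are just the ones from the example
| above):
|
|     # Proportional issuance keeps ownership shares constant.
|     stakes = {"Whale": 90.0, "Shrimpy": 10.0}
|     supply = sum(stakes.values())
|     yield_rate = 0.10  # 10% per year, as in the example
|
|     for year in range(1, 4):
|         rewards = {k: v * yield_rate for k, v in stakes.items()}
|         stakes = {k: v + rewards[k] for k, v in stakes.items()}
|         supply += sum(rewards.values())
|         shares = {k: v / supply for k, v in stakes.items()}
|         # Whale stays at 0.90 and Shrimpy at 0.10 every year.
|         print(year, shares)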
| rfoo wrote:
| Sure, friends also won't let friends skip the fact that
| circulating supply of ETH is now decreasing instead of
| increasing.
|
| Also, only ~30% of tokens are staked. The 30% who chose to
| stake essentially tax the other 70% in use. Each of the
| validators does the same amount of work (ok, strictly speaking
| you get to do more when you have more ETH staked, but being
| a validator is cheap and does not cost significantly more
| energy even if you are being selected more frequently,
| because running one proposal is so cheap - that's the whole
| environmental point, right?), except what they receive is
| proportional to how much they stake.
|
| I hate being mean, but sorry, remembering to check one's
| assumption is a habit I gained after elementary school, so
| maybe that's too hard for you.
| pa7x1 wrote:
| > Sure, friends also won't let friends skip the fact that
| circulating supply of ETH is now decreasing instead of
| increasing.
|
| This changes absolutely nothing about the calculation.
| Furthermore, the change in circulating supply last year
| was 0.07%.
|
| > Also, only ~30% tokens are staked.
|
| Correct.
|
| > The 30% who chose to stake essentially tax the other
| 70% in use.
|
| There is something called opportunity cost. With the
| existence of liquid staking derivatives the choice to
| stake or not is one of opportunity cost. Plenty of people
| may consider the return observed by staking insufficient
| given the opportunity cost and additional risks.
| Participating in staking is fully permissionless, stakers
| are not taxing non-stakers. They are being remunerated
| for their work.
|
| > Each of the validator do exactly same amount of work
| (that's the point, right) except what they receive is
| proportioned to how much they stake.
|
| Incorrect. A staker does an amount of work proportionate to
| its stake. That's why it gets paid more. A staker gets
| paid for fulfilling its duties as defined in the protocol
| (attesting, proposing blocks, participating in sync
| committees). For each of those things there are some
| rewards and some punishments in case you fail to fulfill
| them. If a staker has more validators running, it simply
| fulfills more of those duties more often, hence its
| reward scales linearly with the number of validators.
| rfoo wrote:
| > Participating in staking is fully permissionless,
| stakers are not taxing non-stakers. They are being
| remunerated for their work.
|
| That's just a more polite way to say tax. Being
| permissionless is cool, but it's still tax in my dict.
|
| > There is something called opportunity cost.
|
| And, who is going to be able to have a larger percentage
| of their funds staked, a poor or a whale? You need a
| (mostly) fixed amount of liquidity to use the thing.
|
| > Incorrect. A staker does proportionate amount of work
| to its stake.
|
| Apologies, I edited my original reply which should answer
| this.
|
| In short, I don't see anything preventing me from running
| 10000 validators with 32 ETH each at very similar cost to
| running just one. It's certainly not linear.
| pa7x1 wrote:
| > That's just a more polite way to say tax. Being
| permissionless is cool, but it's still tax in my dict.
|
| It most certainly is not. They are doing work for the
| network and getting remunerated for it. That's not a tax.
| That's what is commonly referred to as a job. A kid that
| delivers newspapers over the weekend is not taxing the
| kid that decides not to. Both make a free decision on
| what to do with their time and effort given how much it's
| worth to them. Running a validator takes skill, time,
| opportunity cost, and you assume certain risks of capital
| loss. You are getting remunerated for it.
|
| > And, who is going to be able to have a larger
| percentage of their funds staked, a poor or a whale? You
| need a (mostly) fixed amount of liquidity to use the
| thing.
|
| Indeed, the protocol cannot solve wealth inequality.
| That's an out of protocol issue. It cannot cure cancer
| either.
|
| > In short, I don't see anything preventing me to run
| 10000 validators with 32 ETH each with very similar cost
| to running just one. It's certainly not linear.
|
| There are some fixed costs, indeed. But they are rather
| negligible. You need a consumer-grade PC (1000 USD) and
| consumer-grade broadband to solo stake. Or you can use a
| Liquid Staking Derivative which will have no fixed costs
| but will have a 10% cut. The curve of APY as a function
| of stake is very flat. Almost anything else around us has
| greater barriers to entry or economies of scale.
| everfree wrote:
| > And, who is going to be able to have a larger
| percentage of their funds staked, a poor or a whale?
|
| This is a truth that's fundamental to all types of
| investing. Advantaged people can set aside millions and
| not touch it for a year or five or twenty. Disadvantaged
| people can't invest $20 because there's a good chance
| they'll need it to buy dinner.
|
| Stocks, bonds, CDs, real estate, it all works like this.
| You've touched on a fundamental property of wealth.
| hanniabu wrote:
| > Also, only ~30% tokens are staked. The 30% who chose to
| stake essentially tax the other 70% in use.
|
| And in PoW miners tax 100% of holders.
|
| > what they receive is proportioned to how much they
| stake
|
| Wealthy miners with state-of-the-art ASICs benefit more
| than some kid mining at home with an old GPU.
| Maintenance/cost of mining equipment benefits from
| economies of scale too.
|
| I hate being mean, but sorry, remembering to check one's
| assumption is a habit I gained after elementary school,
| so maybe that's too hard for you.
| Workaccount2 wrote:
| The idea that AI trained on artist-created content is theft is
| kind of ridiculous anyway. Transformers aren't large archives
| of data with needles and thread to sew together pieces. The
| whole argument is meant to stifle an existential threat, not to
| halt some illegal transgression. If they cared about the latter
| a simple copyright filter on the output of the models would be
| all that's needed.
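|
| For illustration, the simplest version of such an output filter
| is only a few lines (a Python sketch; the n-gram length, the
| reference corpus, and the idea that exact n-gram overlap is
| "good enough" are all assumptions on my part):
|
|     # Naive "copyright filter": flag model output that shares a
|     # long word n-gram with any protected reference text.
|     def ngrams(text, n=12):
|         words = text.lower().split()
|         return {" ".join(words[i:i + n])
|                 for i in range(len(words) - n + 1)}
|
|     def looks_copied(output, protected_texts, n=12):
|         out_grams = ngrams(output, n)
|         return any(out_grams & ngrams(ref, n)
|                    for ref in protected_texts)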
| JohnKemeny wrote:
| I think you should read the case material for NY Times v
| OpenAI and Microsoft.
|
| It literally says that large archives of NY Times articles are
| stored, verbatim, within ChatGPT, and that they were able
| to retrieve them through the API.
| Workaccount2 wrote:
| ..which makes no sense. It is either an argument of
| ignorance or of purposeful deceit. There is no coherent
| data corpus (compressed or not) in ChatGPT. What is stored
| are weights that create a string of tokens that can
| recreate excerpts of the data it was trained on, with some
| imperfect level of accuracy.
|
| Which I agree is problematic, and OpenAI doesn't have the
| right to disseminate that.
|
| _But that doesn't mean OpenAI doesn't have the right to
| train on it_.
|
| Content creators are doing a purposeful sleight of hand to
| conflate "outputting copyrighted data" with "training on
| copyrighted data".
|
| It's illegal for me to read an NYT article and recite it
| from memory onto my blog.
|
| It's not illegal for me to read an NYT article and write my
| own summary of the article's contents on my blog. This has
| been true forever and has forever been a staple in new
| content creation.
| Philip-J-Fry wrote:
| When you describe ChatGPT as just a model with weights
| that can create a string of tokens, is it any different
| from any lossless compression algorithm?
|
| I'm sure if I had a JPEG of some copyrighted raw image it
| could still be argued that it is the same image. JPEG is
| imperfect; the result you get is the same every time you
| open it, but it's not the same as the original input data.
|
| ChatGPT would give you the same output every time, and it
| does if you turn off the "temperature" setting. Introduce
| a bit of randomness into a JPEG decoder and functionally
| what's the difference? A slightly different string of
| tokens for ChatGPT versus a slightly different collection
| of pixels for a JPEG.
| CyberDildonics wrote:
| Did you mean lossy compression algorithm? That would make
| sense.
| bckr wrote:
| > There is no coherent data corpus (compressed or not) in
| ChatGPT.
|
| I disagree.
|
| If you can get the model to output an article verbatim,
| then that article is stored in that model.
|
| The fact that it's not stored in the same format is
| meaningless. It's the same content regardless of whether
| it's stored as plaintext, compressed text, PDF, png, or
| weights in a model.
|
| Just because you need an algorithm such as a specialized
| prompt to retrieve this memorized data, is also
| irrelevant. Text files need to be interpreted in order to
| display them meaningfully, as well.
| cthalupa wrote:
| > If you can get the model to output an article verbatim,
| then that article is stored in that model.
|
| You can't get it to do that, though.[1]
|
| The NYT vs OpenAI case, if anything, shows that even with
| significant effort trying to get a model to regurgitate
| specific work, it cannot do it. They found articles it
| had overfit on due to snippets being reposted elsewhere
| across the internet, and they could only get it to output
| those snippets, and not in correct order. The NYT,
| knowing the correct order, re-arranged them to fit the
| ordering in the article.
|
| Even doing this, they were only able to get a hundred or
| so words out of the 15k+ word articles.
|
| No one who knows anything about these models disagrees
| that overfitting can cause this sort of behavior, but the
| overwhelming majority of the data in these models is not
| overfit and they take a lot of care to resolve the issue
| - overfitting isn't desirable for general purpose model
| performance even if you don't give a shit about copyright
| laws at all.
|
| People liken it to compression, like the GP mentioned,
| and in some ways, it really is. But in a very real
| sense, even with the incredibly efficient "compression"
| the models do, there's simply no way for them to actually
| store all the training data people seem to think is
| hidden in there, waiting for the right prompt. The
| reality is only the tiniest fraction of overfit data can
| be recovered this way. That doesn't mean that the overfit
| parts can't be copyright infringing, but that's a very
| separate argument than the general idea that these are
| constantly putting out a deluge of copyrighted material.
|
| (None of this goes for toy models with tiny datasets,
| people intentionally training models to overfit on data,
| etc. but instead the "big" models like GPT, Claude,
| Llama, etc.)
|
| 1. https://fingfx.thomsonreuters.com/gfx/legaldocs/byvrkx
| bmgpe/...
| bckr wrote:
| > The NYT, knowing the correct order, re-arranged them to
| fit the ordering in the article.
|
| > Even doing this, they were only able to get a hundred
| or so words out of the 15k+ word articles.
|
| OK, that's less material than I believed, which shows the
| details matter. But we agree that the overfit material,
| while limited, is stored in the model.
|
| Of course, this can be (and surely is) mitigated by
| filtering the output, as long as the product is the
| output and not the model itself.
| semi wrote:
| >Just because you need an algorithm such as a specialized
| prompt to retrieve this memorized data, is also
| irrelevant.
|
| I disagree. Granted I'm a layman and not a lawyer so I
| have no clue how the court feels. But I can certainly
| make very specialized algorithms to produce whatever
| output I want from whatever input I want, and that
| shouldn't let me declare any input as infringing on any
| rights.
|
| For the reductio ad absurdum example: I demand everyone
| stops using spaces, using the algorithm 'remove a space
| and add my copyrighted text' it produces an identical
| copy of my copyrighted text.
|
| For the less absurd example... if I took any clean model
| without your copyrighted text, and brute-forced prompts
| and settings until I produced your text, is your model
| violating the copyright, or are my inputs?
| SahAssar wrote:
| > Content creators are doing a purposeful sleight of hand
| to conflate "outputting copyrighted data" with
| "training on copyrighted data".
|
| I don't think so, I think it's usually argued as two
| different things.
|
| The "training on copyrighted data" argument is usually
| that we never licensed this work for this sort of use and
| it is different enough from previously licensed uses that
| it should be treated differently.
|
| The "outputting copyrighted data" argument is somewhat
| like your output is so similar as to constitute a (at
| least) partial copy.
|
| Another argument is that licensed data is whitewashed by
| being run through a model. So you could have GPL-licensed
| open source code run through a model and output exactly
| the same, but because it has been output by the model it is
| considered "cleaned" of the GPL restrictions. Clearly this
| output should still be GPL-licensed.
|
| > It's not illegal for me to read an NYT article and
| write my own summary of the article's contents on my
| blog. This has been true forever and has forever been a
| staple in new content creation.
|
| What if I compress the NYT article with gzip? What if I
| build an LLM that always replies with the full article at
| 99% accuracy? Where is the line?
|
| This is not a technical issue; we need to decide on this
| just like we did with copyright, trademarks, etc.
| Regardless of what you think, this is not a non-issue, and
| we can't use the same rules as we did up until now unless
| we treat all ML systems as either duplication machines or
| humans, and neither option seems to solve the issues.
| freedomben wrote:
| > _Another argument is that licensed data is whitewashed
| by being run through a model. So you could have GPL
| licensed code that is open source run through a model and
| then output exactly the same but because it has been
| outputted by the model it is considered "cleaned" from
| the GPL restrictions. Clearly this output should still be
| GPL:ed._
|
| I don't think anybody is making that argument. The NY
| Times claims to have gotten ChatGPT to spit out NY Times
| articles verbatim but there is considerable doubt about
| that. Regardless, everyone agrees that a verbatim (or
| close to it) copy is a copyright violation, even OpenAI. Every
| serious model has taken steps to prevent that sort of
| thing.
| neuralRiot wrote:
| > It's not illegal for me to read an NYT article and
| write my own summary of the article's contents on my
| blog. This has been true forever and has forever been a
| staple in new content creation.
|
| It's not that clear-cut. It falls under the "fair use"
| doctrine. Section 107 of US copyright law states that
| the determination depends on:
|
| > (1) the purpose and character of the use, including
| whether such use is of a commercial nature or is for
| nonprofit educational purposes; (2) the nature of the
| copyrighted work; (3) the amount and substantiality of
| the portion used in relation to the copyrighted work as a
| whole; and (4) the effect of the use upon the potential
| market for or value of the copyrighted work.
|
| Another thing we need to consider is that the law was
| drafted with the limitations of the human mind as an
| unconscious factor (i.e., not many people would be able to
| recite War and Peace verbatim from memory). This just brings up the
| fact that copyright law needs a complete re-think.
| mey wrote:
| Our copyright model isn't sufficient yet. Is putting a work
| through training/a model sufficient to clear the
| transformative-use bar? That doesn't make you safe from
| trademarks. If the model can produce outputs on the other
| side that aren't sufficiently transformative, then that single
| instance is a copyright violation.
|
| Honestly, instead of trying to cleanup the output, it's much
| safer to create a licensed input corpus. People haven't
| because it's expensive and time consuming. Every time I
| engage with an AI vendor, my first question is whether they
| indemnify against copyright violations in the output. I was
| shocked that Google Gemini/Bard only added that this year.
| ffsm8 wrote:
| I'm honestly surprised AI-washing hasn't become way more
| widespread than it is at this point.
|
| I mean, recording a good song is hard. Generating a good
| song is almost impossible. But my gut feeling would've been
| that recreating a popular song for plausible deniability
| would be a lot easier.
|
| Same with republishing bestselling books and related media.
| (I.e., take Lord of the Rings and feed it paragraph by
| paragraph into an LLM that you've prompted to rephrase each
| in the style of a currently bestselling author.)
| jncfhnb wrote:
| Nothing will ever protect you from trademark violations
| because trademarks can be violated completely by accident
| without knowledge of the real work. Copying is not the
| issue.
| QuantumGood wrote:
| NY Times v. OpenAI and Microsoft says the opposite: that
| large archives of NY Times articles were retrieved verbatim
| via the API. This may or may not matter to how LLMs work, but
| "large archive" seems accurate, aside from semantic arguments
| (e.g. "compressed archive" may be semantically more
| accurate).
| cthalupa wrote:
| > NY Times v OpenAI and Microsoft says the opposite, that
| verbatim, large archives of NY Times articles were
| retrieved via API.
|
| This does not match my understanding of the information
| available in the complaint. They might claim they were able
| to do this, but the complaint itself provides some specific
| examples that OpenAI and Microsoft discuss in a motion to
| dismiss... and I think the motion does a very strong job of
| dismantling that argument based on said examples.
|
| https://fingfx.thomsonreuters.com/gfx/legaldocs/byvrkxbmgpe
| /...
| tomcam wrote:
| Yet before "safeguards" were added a prompt could say "in the
| style of Studio Ghibli" and you could get exactly that.
|
| Would it be possible if Studio Ghibli images had not been
| used in the training?
| semi wrote:
| If it was trained on a sufficient amount of fan art made in
| the Studio Ghibli style and tagged as such, yes.
|
| Otherwise those would just be unknown words, same as asking
| an artist to do that without any examples.
|
| though I am curious how performance would differ between
| training on only actual studio Ghibli art, only fan art, or
| a mix. Maybe the fan art could convey what we expect
| 'studio Ghibli style' to be even more, whereas actual art
| from them could have other similarities that that tag
| conveys.
| Unai wrote:
| I don't understand. If I make a painting (hell, or a whole
| animated movie) in the style of Studio Ghibli, am I
| infringing their copyright? I don't think so. A style is
| just an idea; if you want to protect an idea to the point
| of no one even getting inspired by it, just don't let it out
| of your brain.
|
| If the produced work is not a copy, why does it matter if
| it was generated by a biological brain or by a mechanical
| one?
| jrm4 wrote:
| I fail to see how the argument is ridiculous; and I'll bet
| that a jury would find the idea that "there is a copy inside"
| at least reasonable, especially if you start with the premise
| that "the machine is not a human being."
|
| What you're left with is a machine that produces "things that
| strongly resemble the original, that would not have been
| produced, had you not fed the original into the machine."
|
| The fact that there's no "exact copy inside" the machine
| seems a lot like splitting hairs; like saying "Well, there's
| no paper inside the hard drive so the essence of what is
| copyable in a book can't be in it"
| GaggiX wrote:
| Having exact copies of the samples inside the model weights
| would be an extremely inefficient use of space, and it also
| would not generalize. Unless it generated a copy so close
| to the original that it would violate copyright law if
| used, I wouldn't find it very reasonable to think that
| there is a memorized copy inside the model weights
| somewhere.
| ziofill wrote:
| A program that can produce copies is the same as a copy.
| How that copy comes into being (whether out of an
| algorithm or read from storage) is related, but not
| relevant.
| LordDragonfang wrote:
| >A program that can produce copies is the same as a copy.
|
| A program that _always_ produces copies is the same as a
| copy. A program that merely _can_ produce copies
| categorically is not.
|
| The Library of Babel[1] can produce copyrighted works,
| and for that matter so can any random number generator,
| but in almost every normal circumstance will not. The
| same is true for LLMs and diffusion models. While there
| are some circumstances in which you can produce copies of a
| work, in natural use that's only for things that will
| come up thousands of times in the training set -- by and
| large, famous works in the public domain, or cultural
| touchstones so iconic that they're essentially
| genericized (one prominent copyrighted example is the
| officially released promo materials for movies).
|
| [1] https://libraryofbabel.info/
| GaggiX wrote:
| Yeah, that's right. I doubt that a model would generate an
| image or text so close to a real one as to violate copyright
| law just by pure chance; the image/text space is
| incredibly large.
| Arainach wrote:
| An MP3 file is a lossy copy, but is still copyright
| infringement.
|
| Copyright infringement doesn't require exact copies.
| GaggiX wrote:
| I didn't say it takes an exact copy for copyright
| infringement.
| Workaccount2 wrote:
| If I made a bot that read Amazon reviews and then output a
| meta-review for me, would that be a violation of Amazon's
| copyright? (I'm sure somewhere in the Amazon ToS they claim
| all ownership rights of reviews).
|
| If it output those reviews verbatim, sure I can see the
| issue, the model is overfitting. But if I tweak the model
| or filter the output to avoid verbatim excerpts, does an
| Amazon lawyer have a solid footing for a "violation of
| copyright" lawsuit?
| jononor wrote:
| As far as I understand, according to current copyright
| practices: If you sing a song that someone else has
| written, or pieces thereof, you are in violation. This is
| also the case if you switch out the instrumentation
| completely, say play trumpet instead of guitar, or have a male
| choir sing a female line. If one were to make a medley of many
| such parts, it would not automatically stop being a violation
| either. So we do have examples of things very far
| from a verbatim copy being considered violations.
| lisperforlife wrote:
| I am curious about models like encodec or soundstream. They
| are essentially meant to be codecs informed by the music
| they are meant to compress to achieve insane compression
| ratios. The decompression process is indeed generative
| since a part of the information that is meant to be decoded
| is in the decoder weights. Does that pass the smell test
| from a copyright law's perspective? I believe such a
| decoder model is powering gpt-4o's audio decoding.
| kimixa wrote:
| I think the distinction between "Lossy Compression" and
| "Trained AI" is... vague according to the current legal
| definitions. Or even "lossless" in some cases - as shown by
| people being able to get written articles output verbatim.
|
| While the extremes are obvious, there's a big stretch of gray
| in the middle. A similar issue occurs in non-AI art, the
| difference between inspiration and tracing/copying isn't well
| defined either, but the current method of dealing with that
| (being on a case-by-case basis and a human judging the
| difference) clearly cannot scale to the level that many
| people intend to use these tools.
| cthalupa wrote:
| Has anyone been able to actually get a verbatim copy of a
| written article? The NYT got a ~100 word fragment made up
| of multiple snippets of a ~15k word article, with the
| different snippets not even being in order. (The Times had
| to re-arrange the snippets to match the article after the
| fact)
|
| I am simply not aware of anyone successfully doing this.
| kimixa wrote:
| The amount of content required to call it a "Copy" is
| also a gray area.
|
| Same with the idea of "prompting" and the amount required
| to generate that copyrighted output - again there's the
| extremes of "The prompt includes copyrighted information"
| to "Vague description".
|
| Arguably some of the same issues exist outside AI; it's
| just that accessibility, scale, and the lack of a "Legal
| Individual" on one side complicate things. For example,
| if I describe Mickey Mouse sufficiently accurately to an
| artist and they reproduce it to the degree it's considered
| copyright infringement, is it me or the artist that did
| the infringement? Then what if the artist /had/ seen the
| previously copyrighted artwork, but still produced the
| same output from that same detailed prompt?
| immibis wrote:
| What's good for the goose is good for the gander. It may or
| may not be like theft, but either way, if one of us trained
| an AI on Hollywood movies, you best believe we'd get sued for
| eleventy billion dollars and lose. It's only fair that we
| hold corporations to the same standard.
| hecanjog wrote:
| I also highly doubt anyone who signed agreements to have their
| music included in the Free Music Archive would have been OK
| with this. The particular type of license was important to
| contributors and there's a difference between allowing for
| rebroadcast without paying royalties and allowing for
| derivative works... I don't really care to argue the point, but
| it's why there were so many different types of licenses for the
| original FMA. This just glosses over all that.
| blargey wrote:
| If you look at the repo where the model is actually hosted
| they specify
|
| > All audio files are licensed under CC0, CC BY, or CC
| Sampling+.
|
| These explicitly permit derivative works and commercial use.
|
| > Attribution for all audio recordings used to train Stable
| Audio Open 1.0 can be found in this repository.
|
| So it's not being glossed over, and licenses are being abided
| by in good faith imo.
|
| I wish they'd just added a sentence to their press release
| specifying this, though, since I agree it looks suspect if
| all you have to go by is that one line.
|
| (Link: https://huggingface.co/stabilityai/stable-audio-
| open-1.0#dat... )
| TaylorAlexander wrote:
| I'm so happy to see this! I've been saying for a while that if they
| focused on sample efficiency and building large public
| datasets, including encouraging Twitter and other social media
| sites to add image license options and also encouraging people
| to add alt text (which would also help the vision impaired!),
| they really could build the models they want while also
| respecting creatives, thus avoiding pissing a bunch of people
| off. It's nice to see Stability step up and actually train on
| open data!
| ancientworldnow wrote:
| This has been Adobe Firefly's value proposition for months now.
| It works fine and is already being utilized in professional
| workflows with the blessing of lawyers.
| hapticmonkey wrote:
| If you're worried about Proof of Work leading to giant server
| farms using huge amounts of energy, then I've got something to
| tell you about AI...
| Sephr wrote:
| The "Etherium merge moment" is entirely different and it irks
| me to see it compared favorably with this project. It didn't
| solve proof of work at all, as it assigned positive value to
| past environmental harms.
|
| The only 'solution' (more a mitigation) to Ethereum proof of
| work's environmental harms is to devalue it.
|
| Unlike your example, this project actually seems to be a net
| positive for society that wasn't built on top of clear and
| obvious harms.
| nickthegreek wrote:
| I keep hearing about the pending death of Stability, but here we
| are with another release. I am rootin for them.
| treesciencebot wrote:
| This looks like the one that got leaked a couple weeks ago, so I
| guess they decided it's better to open-source it at this point,
| after the leak [0].
|
| [0]: https://x.com/cto_junior/status/1794632281593893326
| tmabraham wrote:
| It was already planned for open-sourcing; the leak did not
| affect the plans in any way.
| washadjeffmad wrote:
| It is. The model.ckpt from petra-hi-small matches the official
| HF repo.
|
| SHA256: 6049ae92ec8362804cb4cb8a2845be93071439da2daff9997c285f8
| 119d7ea40
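|
| For anyone who wants to compare their local copy, a minimal way
| to compute that hash in Python (assuming the checkpoint is saved
| locally as model.ckpt):
|
|     import hashlib
|
|     def sha256_of(path):
|         h = hashlib.sha256()
|         with open(path, "rb") as f:
|             # Read in 1 MiB chunks to avoid loading the whole
|             # multi-gigabyte checkpoint into memory.
|             for chunk in iter(lambda: f.read(1 << 20), b""):
|                 h.update(chunk)
|         return h.hexdigest()
|
|     print(sha256_of("model.ckpt"))  # compare against the hash above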
| mg wrote:
| When they released Stable Audio 2.0, I tried to create "unusual"
| songs with prompts like "roaring dragon tumbling rocks stormy
| morning". The results are quite interesting:
|
| https://www.youtube.com/@MarekGibney/videos
|
| I find it fascinating that you can put all information needed to
| recreate a whole complex song into a string like
| rough stormy morning car rocks hammering drum solo
| roaring dragon downtempo audiosparx-v2-0 seed 5
|
| This means a whole album of these songs could easily fit into a
| single TCP/IP packet.
|
| If a music genre evolves in which each song is completely defined
| by its title, maybe it will be called "promptmusic".
|
| I will try the new model with the same prompts and upload the
| results.
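|
| A back-of-the-envelope check of the packet claim (an
| illustrative Python snippet; the prompt strings are made up):
|
|     # A typical Ethernet MTU is 1500 bytes; ten prompt+seed
|     # "songs" fit with room to spare.
|     album = [
|         "rough stormy morning car rocks hammering drum solo "
|         "roaring dragon downtempo audiosparx-v2-0 seed %d" % i
|         for i in range(10)
|     ]
|     payload = "\n".join(album).encode("utf-8")
|     print(len(payload), "bytes")  # roughly 1 kB, well under 1500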
| TeMPOraL wrote:
| That's a great example of the fact that information about
| something, say a song, isn't entirely encoded only in the
| medium you use to transfer it - it's partially there, and
| partially in the device you're using to read it! An MP3 file is
| just gibberish without a program that can decode it.
|
| In this case, the whole album could indeed fit into a single
| TCP/IP packet - because the bulk of the information that makes up
| those songs is contained in the model, which weighs however
| many gigabytes it does. The packet carrying your album is
| meaningless until the recipient also procures the model.
|
| (Tangent: this observation was my first _mind. blown._
| experience when reading GEB over a decade ago.)
| drivebyhooting wrote:
| From the announcement I couldn't figure out whether it can do
| audio-to-audio.
|
| Text to audio is too limiting. I'd rather input a melody or a
| drum beat and have the AI compose around it.
| duranduran wrote:
| This kind of exists, but I doubt there are any commercial
| solutions based on it yet.
| https://crfm.stanford.edu/2023/06/16/anticipatory-music-tran...
|
| Their paper says that they trained it on the Lakh MIDI dataset,
| and they have a section on potential copyright issues as a
| result.
|
| Assuming you don't care about legal issues, theoretically you
| could do: raw signal -> something like Spotify Basic Pitch
| (outputs MIDI) -> Anticipatory (outputs composition) -> Logic
| Pro/Ableton/etc + Native Instruments plugin suite for full song
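|
| The first hop of that chain might look roughly like this with
| the basic-pitch package (a sketch from memory of its README;
| double-check the docs for the exact API, and the file names
| here are made up):
|
|     # Audio -> MIDI with Spotify's Basic Pitch, then hand the
|     # resulting MIDI file to the next tool in the chain.
|     from basic_pitch.inference import predict
|
|     model_output, midi_data, note_events = predict("humming.wav")
|     midi_data.write("humming.mid")  # a pretty_midi.PrettyMIDI object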
| ben_w wrote:
| > Warm arpeggios on an analog synthesizer with a gradually rising
| filter cutoff and a reverb tail
|
| I appreciate that the underlying tech is completely different and
| much more powerful, but it is a pretty strange feeling to find a
| major AI lab's example sounding so similar to an actual Markov
| chain MIDI generator I made 14-15 years ago:
| https://youtu.be/depj8C21YHg?si=74a4DHP14EFCeYrB
|
| (Not _that_ similar, just enough for me to go "huh, what a
| coincidence").
| lancesells wrote:
| "a drummer could fine-tune on samples of their own drum
| recordings to generate new beats"
|
| Yes, this is the reason someone becomes a drummer.
___________________________________________________________________
(page generated 2024-06-05 23:01 UTC)