[HN Gopher] OpenVoice: Versatile Instant Voice Cloning
___________________________________________________________________
OpenVoice: Versatile Instant Voice Cloning
Author : saeedesmaili
Score : 258 points
Date : 2024-01-01 15:16 UTC (7 hours ago)
(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)
| peddling-brink wrote:
| GitHub: https://github.com/myshell-ai/OpenVoice
| Checkpoint:
| hxxps://myshell-public-repo-hosting.s3.amazonaws.com/checkpoints_1226.zip
|
| (Checkpoint link defanged because I'm allergic to direct links to
| zip files hosted on Amazon. Nor have I reviewed what the file
| contains.)
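|
| A minimal sketch of the "defanging" described above, assuming it
| means nothing more than rewriting the URL scheme so the link is not
| auto-linked. The function names and the example.com URL are
| hypothetical; the point is only that the transformation is trivially
| reversible by the reader.

```python
def defang(url: str) -> str:
    """Rewrite the scheme so the URL is no longer rendered as a clickable link."""
    for scheme, safe in (("https://", "hxxps://"), ("http://", "hxxp://")):
        if url.startswith(scheme):
            return safe + url[len(scheme):]
    return url  # unknown scheme: leave untouched


def refang(url: str) -> str:
    """Undo defang(): restore the original scheme."""
    for safe, scheme in (("hxxps://", "https://"), ("hxxp://", "http://")):
        if url.startswith(safe):
            return scheme + url[len(safe):]
    return url


original = "https://example.com/archive.zip"  # placeholder URL, not the real checkpoint
defanged = defang(original)
print(defanged)
print(refang(defanged) == original)
```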
| crazysim wrote:
| Thanks for the link to the repo. It's very useful.
|
| As for the checkpoint, I'm not allergic and I don't do security
| theater:
|
| https://github.com/myshell-ai/OpenVoice?tab=readme-ov-file#i...
| links to
|
| https://myshell-public-repo-hosting.s3.amazonaws.com/checkpo...
| peddling-brink wrote:
| Why do you call that security theater? I found and provided
| the information, but didn't make it clickable. Anyone can
| decide for themselves to navigate there.
|
| Your comment comes off as passive aggressive.
| IshKebab wrote:
| I think he's referring to your "defanging" which you
| implied was security related but doesn't actually achieve
| anything at all.
| fieldcny wrote:
| They are making you think about what you are doing before
| you click the link. That's not theatre; that's keeping
| people from clicking arbitrary links to zip files which
| can auto-execute code once downloaded.
|
| I'd suggest that those who think it is theatre probably
| don't understand the implications of that action.
| arccy wrote:
| just downloading a zip file won't auto execute anything.
| and you can't meaningfully review it without downloading
| it, so it pretty much is security theatre
| seabass-labrax wrote:
| On which operating systems can Zip files automatically
| self-execute? Android .APKs come to mind, although in
| this case, Android asks you whether you want to install
| the application and thus gives you a chance to prevent
| the execution.
| IshKebab wrote:
| We understand exactly the implications of that action.
| There are no implications.
|
| Simply downloading a zip from Amazon has zero risk. Even
| _opening_ an arbitrary zip has essentially zero risk. RCE
| from opening a zip is obviously a really critical and
| valuable vulnerability and would not be wasted with a
| public link.
|
| Combine that with the fact that this comes from a voice
| cloning GitHub repo and the chance of this having some
| 0-day are infinitesimal.
|
| Finally just making the link non-clickable does not add
| security. Nobody can take any action to increase their
| security because they have to slightly edit a link (not
| that they would, because it's sensibly a clickable link in
| the GitHub readme).
|
| So yes, I fully understand the implications and it is
| definitely security theatre.
|
| I suggest that those who think that it _isn't_ probably
| haven't really thought about the threat model.
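|
| As a rough illustration of the point that a zip can be looked at
| without running anything in it, here is a sketch that lists an
| archive's entries using only Python's standard zipfile module. The
| demo archive and its file names are made up for the example; nothing
| here refers to the actual checkpoint file.

```python
import io
import zipfile


def list_zip_entries(data: bytes) -> list[tuple[str, int]]:
    """Return (name, uncompressed size) per entry without extracting anything."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


def build_demo_zip() -> bytes:
    """Build a tiny in-memory archive so the example is self-contained."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("checkpoints/config.json", "{}")
        archive.writestr("checkpoints/model.pth", b"\x00" * 16)
    return buffer.getvalue()


entries = list_zip_entries(build_demo_zip())
for name, size in entries:
    print(name, size)
```

| Listing names and sizes only reads the archive's directory; it says
| nothing about whether the files inside are safe to load afterwards.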
| janalsncm wrote:
| What is the threat vector of the functional https link that
| hxxps solves?
| dotancohen wrote:
| What does allergic mean in this context?
| peddling-brink wrote:
| That file could contain anything. I don't know the authors or
| have any idea of their reputation.
|
| I wanted to expose it so people didn't have to comb through
| the github, but decided to make it unclickable out of an
| abundance of caution. This appears to have offended people.
|
| I would not have hesitated to link to hugging face. That is a
| known quantity.
| chrisweekly wrote:
| FWIW I appreciate the courtesy and context; agreed that
| it's not the best idea to link directly to zip files (let
| alone those of questionable provenance).
| dcreater wrote:
| Any GitHub link?
| saeedesmaili wrote:
| Github: https://github.com/myshell-ai/OpenVoice
| Demo: https://research.myshell.ai/open-voice
| smellf wrote:
| Examples: https://research.myshell.ai/open-voice
|
| Seems impressive!
| colesantiago wrote:
| > This repository is licensed under a Creative Commons
| Attribution-NonCommercial 4.0 International License, which
| prohibits commercial usage. MyShell reserves the ability to
| detect whether an audio is generated by OpenVoice, no matter
| whether the watermark is added or not.
|
| So it is not 'open' then and you cannot make money out of this?
| DandyDev wrote:
| It is open, just not by your definition. You can view, use and
| modify the code to your heart's content. Sounds pretty open to
| me!
| jahewson wrote:
| Open for business!
|
| No wait...
| CaptainFever wrote:
| To be specific, while it is not a bad license, it does not
| qualify for the Free Cultural Works mark as defined by the
| Creative Commons and Freedom Defined:
| https://creativecommons.org/public-domain/freeworks/
| bbor wrote:
| Well... "use" isn't exactly free, thus the complaint. On a
| scale of free to not free, "cannot use this for my work" is a
| pretty big jump to the latter end IMO
| c0pium wrote:
| Careful, you're saying the quiet part out loud; freedom is
| about profiting off the uncompensated work of others.
| bbor wrote:
| Well ultimately we all need to eat. If someone wants to
| be compensated in today's society, they either need to
| join a gift-based sub-society (see: OSS foundations, NGOs
| in general) or sell something. Trust me, I totally agree
| that freedom of information should be a completely
| separate concern from resource allocation
|
| EDIT: I guess there's a third option, "work another job
| and use OSS on your off hours". Which feels... idk,
| disrespectful of the whole enterprise. OSS software
| development is important enough to deserve a wage IMO, to
| say the least
| satvikpendem wrote:
| To your last point, people pay what the market will bear.
| In this case, it's free, so don't be surprised that if
| you give something away for free that people, well, take
| it for free. Importance has nothing to do with it.
| beardog wrote:
| As long as your heart's content isn't commercial
| cjbprime wrote:
| And not by opensource.org's definition, which prohibits use
| restrictions. It's not reasonable to act like OP is being
| idiosyncratic when this fails to meet the protected
| definition of "open source".
| gpm wrote:
| The term "open source" is not protected, the OSI
| (opensource.org) attempted and failed to acquire a
| trademark on that term.
| cjbprime wrote:
| Fair enough. Is there _any_ shared definition of "open
| source" which permits use restrictions, then?
| abetusk wrote:
| By the commonly held definition of open, in the context of
| "open source", it is not open.
|
| > You can view, use and modify the code to your hearts
| content.
|
| The non-commercial clause of their license specifically
| prohibits commercial use, so we cannot use this source, and
| presumably the data that the source uses, to our heart's
| content.
|
| The OSI has a definition of open source that clearly states
| commercial use is required [0].
|
| Wikipedia's entry on Open Source Licensing also stipulates
| that commercial re-use is required [1].
|
| There is a term called "source available" which is more in
| line with your intent.
|
| [0] https://opensource.org/osd/
|
| [1] https://en.wikipedia.org/wiki/Open-source_license
| throwup238 wrote:
| You can't. Scammers who don't care about noncommercial licenses
| sure can!
| pclmulqdq wrote:
| Yep, this is one of those "only bad actors" licenses,
| probably as a cash grab.
|
| It will _definitely_ stop those bad actors from scamming
| people this time, right? Right?
| evanmoran wrote:
| This is the most insightful take. Licenses like this prevent
| certain businesses in certain countries, but it is quite
| harmful as it adds a powerful tool for
| propaganda/scammers/etc who don't care about the laws.
|
| Additionally, it only really hurts small businesses &
| startups as the big companies all have teams that can make
| their own version or easily pay for 3rd party APIs. So
| yeah, us startup folks won't like this license much as it
| basically is aimed at us the most.
|
| Either way, congrats with the tech. It does look very
| impressive!
| cyanydeez wrote:
| erm, its existence provides to scammers.
|
| unless you're proposing its use in detecting itself is
| somehow symmetrical, which I really don't think is
| anything but unproven conjecture.
| iAkashPaul wrote:
| That watermark detection rights at the end is real sus
| diggan wrote:
| What exactly are you talking about? The paper doesn't mention
| any watermark at all, as far as I can see/search.
| cwillu wrote:
| The readme on the linked github reads: "MyShell reserves the
| ability to detect whether an audio is generated by OpenVoice,
| no matter whether the watermark is added or not."
| lostlogin wrote:
| As you say, right at the bottom:
| https://github.com/myshell-ai/OpenVoice
| diggan wrote:
| Ah, thank you. Guess that's OK that the company/service
| do whatever they want, the paper/technique doesn't
| involve watermarks, so it'd be easy to remove/modify
| whatever they do in the library/service itself.
| fbdab103 wrote:
| At least right now, there is a literal add_watermark function,
| so probably easy enough to remove that surface level. Unless
| they added something cute to the training data to poison the
| well.
|
| https://github.com/myshell-ai/OpenVoice/blob/a33963c3d764bee...
| huqedato wrote:
| Welcome to the new era of fakes and scams beyond our wildest
| imagination !
| danielbln wrote:
| Elevenlabs has been around for a while now. Genie has been out
| of the bottle for a bit, and the sooner the notion that
| anything digital can be easily faked seeps into the wider
| consciousness the better. Trust nothing.
| ethanbond wrote:
| It can both be true that people need to adapt/"trust
| nothing," and that this is bad.
| ignu wrote:
| I've seen some prank calls (a YouTuber cloned Tucker
| Carlson's voice and called Alex Jones) but he just had a
| sound bank with a few pre-generated lines and it fell apart
| pretty quickly.
|
| At least for now there's too much lag to do a real time
| conversation with a cloned voice.
|
| Speech to Text > LLM Response > Generate Audio
|
| If that time can shrink to subsecond, I think there'll be
| madness. (Specifically thinking of romance scammers)
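|
| The three-stage pipeline above can be sketched as a simple latency
| budget. All stage timings below are invented placeholders rather
| than measurements of any real system, and the 1000 ms budget is just
| one reading of "subsecond".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    latency_ms: int  # assumed per-turn latency, in milliseconds


def total_latency_ms(stages: list[Stage]) -> int:
    """Stages run one after another, so their latencies add up."""
    return sum(stage.latency_ms for stage in stages)


def is_conversational(stages: list[Stage], budget_ms: int = 1000) -> bool:
    """True if a full turn fits under the (assumed) subsecond budget."""
    return total_latency_ms(stages) < budget_ms


# Hypothetical numbers for: Speech to Text > LLM Response > Generate Audio
PIPELINE = [
    Stage("speech_to_text", 400),
    Stage("llm_response", 900),
    Stage("generate_audio", 700),
]

print(total_latency_ms(PIPELINE), is_conversational(PIPELINE))
```

| Because the stages are sequential, every one of them has to shrink
| for the total to drop under the budget; streaming partial results
| between stages would change this arithmetic.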
| shinycode wrote:
| Awful, bots on their own having real conversations with
| people with the voice of a loved one. Scamming on steroids
| ben_w wrote:
| At last summer's WeAreDevelopers World Congress in Berlin,
| one of the talks I went to was by someone who did this with
| their own voice, to better respond to (really long?)
| WhatsApp messages they kept getting.
|
| It worked a bit too well, as it could parse the sound file
| and generate a complete response faster than real-time,
| leading people to ask if he'd actually listened to the
| messages they sent him.
|
| Also they had trouble believing him when he told them how
| he'd done it.
| smt88 wrote:
| > _the notion that anything digital can be easily faked
| seeps into the wider consciousness the better. Trust
| nothing._
|
| This is a society-destroying idea.
|
| Most of us, especially younger people, only know how to vote,
| where there are wars, or even what our parents are doing by
| using digital media.
|
| If digital media becomes untrustworthy, everyone will live in
| a warped and fragile alternate reality that no one can agree
| on.
| diggan wrote:
| > Trust nothing
|
| > This is a society-destroying idea.
|
| Believe it or not, this is how much of the population saw
| The Internet when it first came close to being mainstream.
| Everyone and their mother said "Don't believe anything you
| read on the cybernet", which ended up ironic as everyone
| and their mother ended up being the ones to believe
| anything on the cybernet anyways.
|
| > everyone will live in a warped and fragile alternate
| reality that no one can agree on.
|
| How is this any different from today? The various corners
| of the internet (which is mostly divided by languages:
| English, Russian, Spanish, Chinese and Portuguese) already
| have these vastly different realities and ground-truths.
|
| I'm sure we could survive another Internet-Winter where
| people trust everything a bit less than today.
| smt88 wrote:
| It's vastly different than today because today (or at
| least a few years ago), I could trust videos and voices
| delivered digitally. I can't do that anymore.
| ilikehurdles wrote:
| Technology and society will adapt, just as we adapted
| encryption to verify credentials and secure banking data
| online, we'll end up with a validation signal for video
| and audio.
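|
| One very rough sketch of what a "validation signal" for audio could
| look like, using a keyed hash (HMAC) from Python's standard library
| as a stand-in. This is an assumption for illustration only: a real
| scheme would more likely use public-key signatures so that verifiers
| do not need the secret, and the key and byte strings below are
| placeholders.

```python
import hashlib
import hmac


def sign_audio(audio_bytes: bytes, key: bytes) -> str:
    """Compute an authentication tag over the raw audio bytes."""
    return hmac.new(key, audio_bytes, hashlib.sha256).hexdigest()


def verify_audio(audio_bytes: bytes, key: bytes, tag: str) -> bool:
    """Check the tag in constant time; any change to the audio breaks it."""
    return hmac.compare_digest(sign_audio(audio_bytes, key), tag)


key = b"demo-key-not-a-real-secret"   # placeholder key
recording = b"\x00\x01\x02\x03"       # placeholder for real audio data
tag = sign_audio(recording, key)

print(verify_audio(recording, key, tag))
print(verify_audio(recording + b"\xff", key, tag))
```

| A tag like this only proves the bytes were not altered after
| signing; it does not prove the recording was genuine to begin with.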
| treprinum wrote:
| VALL-E has been on GitHub for over a year already...
| underlines wrote:
| This era is barely new. Look at how old some of the projects
| are:
|
| https://github.com/underlines/awesome-ml/blob/master/audio-a...
|
| The thing that changes is the complexity to run it. I was
| training my wife's voice and my voice for fun and needed 15min
| of audio and trained on my 3080 for 40 minutes.
|
| Now it's 2 Minutes.
| thfuran wrote:
| Yes, and the more accessible it is, the more widespread it
| will be.
| ponector wrote:
| Maybe this will teach people to raise awareness and take
| personal security seriously. Like not to trust anyone who is
| calling, especially from a legacy line. Phone number and voice
| could be easily cloned.
| gnfargbl wrote:
| I had to phone my bank, which is one of the bigger players in the
| UK high street market, a couple of days ago. They're _still_
| encouraging me to enroll in their idiotic "my voice is my
| password" programme. At this stage in the evolution of AI, that
| feels simply negligent.
| toss1 wrote:
| Fidelity Investments just did something even worse ~a week ago
| - It asked me to reply to a few questions, then announced that
| I'd just been enrolled in its voice identification program (or
| whatever they call it).
|
| Now I've got Just Another Item on my ToDo list, to get that
| undone. Gawd, does every company promote its stupidest people
| to management?
| crazysim wrote:
| Clone management's voices and post it to their social
| media/etc. Super undo it!
| hasty_pudding wrote:
| They promote their best schmoozers to management.
|
| They have so much money that competence no longer matters and
| bootlicking will get you much farther.
| throwup238 wrote:
| _> Gawd, does every company promote its stupidest people to
| management?_
|
| Yes: https://en.wikipedia.org/wiki/Dilbert_principle
|
| Ironically, this is the place where they can do the _least_
| damage.
| ben_w wrote:
| I don't know if GDPR (or any of its cousins) applies to you,
| but this kind of thing sounds _exactly_ like the sort of
| thing it's supposed to outlaw.
| jokethrowaway wrote:
| How? Your bank stores personal data covered by gdpr but
| enabling crappy secure systems is not the domain of gdpr.
|
| Most likely this is caused by SCA another European
| directive that ruined our lives with extra security hoops
| (for payment providers) for little extra security - or even
| worse in case of voice password or security questions
| ben_w wrote:
| A person's voice is, I believe, personal data.
|
| > Processing personal data is generally prohibited,
| unless it is expressly allowed by law, or the data
| subject has consented to the processing
|
| - https://gdpr-info.eu/issues/consent/
| seabass-labrax wrote:
| Importantly, you can also revoke consent at any time
| under the GDPR. Unlimited consent isn't possible, so the
| bank would have to make the (dubious) claim that such
| processing did not require permission at all.
| Havoc wrote:
| Investec? Yeah thinking I need to phone them to disable mine
| cwillu wrote:
| From the github readme:
|
| "MyShell reserves the ability to detect whether an audio is
| generated by OpenVoice, no matter whether the watermark is added
| or not."
|
| Call me skeptical...
| hasty_pudding wrote:
| Holy cow! If this works without curated audio...this is amazing!
| senthilnayagam wrote:
| current leader in open source voice cloning is RVC, would like to
| see how it compares to it.
| echelon wrote:
| RVC is voice conversion (audio to audio), and it's typically
| finetuned.
|
| This is zero shot TTS. Samples create vector encodings that
| serve as input to inference. There's no retraining the model
| unless you want it to generalize or perform better.
| pclmulqdq wrote:
| Wonderful company, not a scam at all:
| https://docs.myshell.ai/tokenomics
| SubiculumCode wrote:
| whats with the crypto thing?
| programjames wrote:
| I love this paper. It reads very much like "this is what we did,
| and we want to help others do it too." Also, the section "Remark
| on Novelty" is golden: "OpenVoice does not intend to invent the
| submodules in the model structure ... The contribution of
| OpenVoice is the decoupled framework that seperates the voice
| style and language control from the tone color cloning." They
| don't try to hype up their contribution.
| jamespattn wrote:
| Can someone give me a practical use case where this adds a net
| benefit to society?
| shinycode wrote:
| Aside from the fact that it will be easier to scam people, I
| fail to see benefits. We can already translate everything with
| the same synthetic voice
| Lerc wrote:
| Unifying a voice in tutorial videos so that the difference in
| voice does not distract the learner.
|
| Auto non-toxic rephrasing of online chat in video games, let
| people hear their voice but paraphrase what they said in a
| manner that doesn't turn the platform into a cesspit.
|
| Cloning your own voice so that you can turn a script into audio
| without 50 takes and then having to remove a million Ums and
| errs.
| grayhatter wrote:
| > Auto non-toxic rephrasing of online chat in video games,
| let people hear their voice but paraphrase what they said in
| a manner that doesn't turn the platform into a cesspit.
|
| that feels very orwellian
| ben_w wrote:
| George Orwell -- 'If you want a picture of the future,
| imagine a boot stamping on a human face--for ever.'
|
| I think this is closer to the direction of Huxley in Brave
| New World, where a deeper understanding of how to
| manipulate without brute force creates a very different
| dystopian society than 1984.
| haroldp wrote:
| "Don't you see that the whole aim of Newspeak is to
| narrow the range of thought? In the end we shall make
| thoughtcrime literally impossible, because there will be
| no words in which to express it."
| ben_w wrote:
| Censorship by itself doesn't stop people thinking (or
| even expressing) forbidden thoughts, it stops a person's
| words reaching other people.
|
| BNW had a similar effect by conditioning, rather than by
| applying the strong form of the Sapir-Whorf hypothesis.
| paradox460 wrote:
| Real time translation in the speakers own voice.
| treetalker wrote:
| While listening to the examples given, I noted the cross-
| language ones. I'm eager to improve my accents in my
| nonnative languages by cloning my voice and comparing
| recordings of how I do sound with how I would sound as a
| native speaker!
| abetusk wrote:
| This is an exceptional use case!
|
| Mr Beast talked about translating his videos to other
| languages to get more reach. This can be done for people
| with limited budget or just in general so people can watch
| videos without needing subtitles.
|
| I wouldn't be surprised if we saw this incorporated into YT
| in the near future.
| diggan wrote:
| Person A used to be able to speak, but lost their voice in an
| accident/because of reason Y. Luckily, there is surviving
| audio/video with their voice on it, so a text-to-voice with
| their own voice could be created for them to use.
| goodluckchuck wrote:
| Possibly speech therapy.
|
| Certainly entertainment. Movies / TV. It opens a new
| opportunity for videogames with generative characters.
| kushie wrote:
| apple has Personal Voice for accessibility
| ldoughty wrote:
| From an indie game dev standpoint, I can probably say a
| sentence or two in a given way using my standard headset
| microphone.. and something like this would allow for clean
| voice lines fairly easily, as long as they don't need to stress
| too much emotion... But for a $0 game, that would still be
| beneficial. Imagine all the 2D Zelda/FF like games that don't
| get played today because people would rather listen to dialogue
| than read.
|
| Of course, there's also the preservation of the voice of a
| loved one. I would probably pay to hear my father's voice again
| but there's probably only one or two VHS tapes with his voice
| on it.
| nickpsecurity wrote:
| My pastor has an injured vocal cord that makes him sound
| gritty at times. A technology like this applied to old copies
| of his speaking might make him sound like he used to. I don't
| know if he'd use something like that since we mostly rely on
| the Spirit of Christ to open hearts to the truth.
|
| Outside public speakers, there are probably other people who've
| lost their voice or have trouble vocalizing who might want to
| sound like their old selves. This could help them.
|
| Disclaimer: I think these techs will more often do damage than
| good. I'm just brainstorming an answer to your question.
| grayhatter wrote:
| No.
|
| The real answer is yes, I could probably come up with some
| contrived examples, like I lost my voice in a freak LLM
| accident and now want to clone my old voice. But this doesn't
| (you don't?) really need a net benefit reason to figure it out
| and publish it. Because why? I assume, because "this shouldn't
| exist!" which is just a more palatable way to phrase "won't
| someone think of the children".
|
| Society doesn't benefit from ignorance, so given it can exist,
| what's the problem with it existing? Why does it need a
| practical reason? Because people will do bad things with it?
| Duh, but I'd rather everyone know than just the bad guys
| johnnyworker wrote:
| > Why does it need a practical reason?
|
| To at least give us something as a consolation for all the
| havoc all sorts of deep fakes will wreak on societies. It's
| like asking what a knife can be used for other than murder.
| It's a valid question.
| jamespattn wrote:
| My question wasn't to imply that I don't think a given
| technology should or shouldn't exist.
|
| I was curious to see if anyone could name at the top of their
| head some practical use cases that they feel net out the
| potential harms of cloning and misusing someone else's voice.
|
| There's some nice and certainly practical examples, but I
| don't feel any of them would net out the harms.
|
| Perhaps there's a use case that we can't even comprehend yet
| that would though!
| lbrunson wrote:
| By this logic there shouldn't be regulation on anything,
| because the bad guys will have it any way.
|
| While you can't make it go away, you can disincentivize
| propagation and use which can be the difference between
| thousands of cases of scams/extortions and millions. Until
| there's a stronger argument for voice cloning models (talking
| to a dead loved one is creepy and not a positive argument)
| then we shouldn't encourage tools with overwhelmingly
| nefarious utility.
| abetusk wrote:
| James Earl Jones, presumably hedging against his eventual
| demise, has allowed his voice to be used for things like the
| Star Wars franchise [0].
|
| Small, independent film makers can now use a skeleton crew to
| voice parts.
|
| I can't imagine it would be anything other than a niche
| service, but hearing the voice and, potentially, interacting
| with a chatbot/LLM with the voice of a passed loved one.
|
| This is off the top of my head. I would also guess that this
| technology is a stepping stone for other weird, interesting and
| profoundly helpful uses.
|
| [0] https://www.theverge.com/2022/9/24/23370097/darth-vader-
| jame...
| stale2002 wrote:
| Well we could just look at the obvious and existing use cases
| for text to speech stuff.
|
| Alexa, Siri, and similar are all commonplace.
|
| Another huge usecase would be anything to do with voice acting.
| Either in video games, cartoons, or the like.
|
| This would completely democratize voice acting material, and
| would empower anyone to be able to do this for cheap.
| mattlondon wrote:
| ... and put 99% of voice actors out of business. We'll
| eventually end up with _every_ TV show, movie, and, video
| game being voiced by Ryan Gosling and Beyonce because market
| research.
| dqv wrote:
| If you've ever done voice prompt recordings for a phone system,
| voice cloning would be super helpful for doing one off tweaks,
| especially if you have to record a bunch. Instead of
| rerecording 20 messages, which can sometimes take hours, you
| can use a clone of your own voice to make the necessary
| modifications. My friend does a lot of recordings as part of
| his job and when I showed him the Adobe voice editing preview
| he got really excited. It has the potential to make tweaks a
| lot easier, less time consuming, and reduce voice strain.
| userbinator wrote:
| You would be able to translate media into the language of your
| choice, but also retaining the original voices.
| qwertox wrote:
| And suddenly it becomes a bit weird:
|
| https://docs.myshell.ai/tokenomics
|
| Tokenomics
|
| Disclaimer: MyShell is currently in the testing phase, and the
| content of the whitepaper may be subject to change in the future.
|
| $SHELL is the token used for user incentive, governance and in-
| app utility.
|
| The total supply of $SHELL is 1,000,000,000
| diggan wrote:
| And luckily, this submission seems to be about the
| paper/technology OpenVoice, not about the company MyShell
| (whatever that is).
| qwertox wrote:
| License[0]: This repository is licensed under a Creative
| Commons Attribution-NonCommercial 4.0 International License,
| which prohibits commercial usage. MyShell reserves the
| ability to detect whether an audio is generated by OpenVoice,
| no matter whether the watermark is added or not.
|
| [0] https://github.com/myshell-ai/OpenVoice
| z991 wrote:
| I commend the authors on making this easy to try! However it
| doesn't work very well for me for general voice cloning. I read
| the first paragraph of the wikipedia page on books and had it
| generate the next sentence. It's obviously computer generated to
| my ear.
|
| Audio sample:
| https://storage.googleapis.com/dalle-party/sample.mp3
|
| Cloned voice (converted to mp3):
| https://storage.googleapis.com/dalle-party/output_en_default...
|
| All I did was install the packages with pip and then run
| "demo_part1.ipynb" with my audio sample plugged in. Ran almost
| instantly on my laptop 3070 Ti / 8GB. (Also, I admit to not
| reading the paper, I just ran the code)
| pclmulqdq wrote:
| Looking at the website and the examples, it's pretty clearly
| set up to make stylized anime voices.
| fbdab103 wrote:
| Thanks for the real example. Sounded quite generated to my ear
| as well. Wonder if it can do any better with more source
| material.
| thorum wrote:
| My experience with other tools like xtts is you really need to
| have a studio-quality voice sample to get the best results.
| amluto wrote:
| The most obvious problem to my ears is the syllable timing
| and inflection of the generated speech, and, intuitively,
| this doesn't seem like a recording quality issue. It's as if
| it did a mostly credible job of emulating the speaker trying
| to talk like a robot.
| hwillis wrote:
| The biggest trip-up is the pronunciation of
| "prototypically", and you had "typically" in your original.
| Maybe it's overfitting to a stilted proto-typically? Could
| try with a different, less similar sentence
| dijksterhuis wrote:
| > It's obviously computer generated to my ear.
|
| From the README:
|
| > Disclaimer: This is an open-source implementation that
| approximates the performance of the internal voice clone
| technology of myshell.ai. The online version in myshell.ai has
| better 1) audio quality, 2) voice cloning similarity, 3) speech
| naturalness and 4) computational efficiency.
| tremarley wrote:
| Their Tokenomics page says
|
| $SHELL is the token used for user incentive, governance and in-
| app utility.
|
| The total supply of $SHELL is 1,000,000,000
|
| Team, Treasury, Advisors & Private Sale = 55% allocation
|
| Community Incentive = 40% allocation
|
| Liquidity = 5%
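|
| For scale, the percentages quoted above can be turned into token
| counts. The total supply and the three percentages come from the
| quoted page; the dictionary keys and function name are just labels
| chosen for this sketch.

```python
TOTAL_SUPPLY = 1_000_000_000  # total $SHELL supply as quoted above

ALLOCATION_PERCENT = {
    "team_treasury_advisors_private_sale": 55,
    "community_incentive": 40,
    "liquidity": 5,
}


def allocate(total: int, percents: dict[str, int]) -> dict[str, int]:
    """Convert whole-number percentages into token counts."""
    if sum(percents.values()) != 100:
        raise ValueError("allocation percentages must sum to 100")
    return {name: total * pct // 100 for name, pct in percents.items()}


allocation = allocate(TOTAL_SUPPLY, ALLOCATION_PERCENT)
for name, tokens in allocation.items():
    print(f"{name}: {tokens:,}")
```

| That works out to 550,000,000 tokens for the team, treasury,
| advisors and private sale, 400,000,000 for community incentive and
| 50,000,000 for liquidity.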
| monkeydust wrote:
| So I guess we could (legally) now create a voice chatbot using
| Mickey Mouse audio from Steamboat Willie?
| andylynch wrote:
| Possibly, except there is no dialogue in it.
| starwin1159 wrote:
| I hope someone can handle Cantonese one day
| RagnarD wrote:
| My first and ongoing thought is that immoral/criminal uses of
| voice cloning vastly exceed any legitimate ones.
| airstrike wrote:
| Which just means we need to build protocols around this risk,
| rather than foolishly trying to shove the genie back in the
| bottle, lest we be left with _only_ the criminal uses
| squigz wrote:
| Out of curiosity, what/how many legitimate use cases have you
| considered?
| graphe wrote:
| What of commercial uses being greater than illegitimate ones?
| YouTube will give people the ability to hear it in their own
| localized language in the author's voice.
| ijhuygft776 wrote:
| Is there some similar software that allows you to add, let's say, 40
| years to a voice?
| anotherevan wrote:
| Is it possible to use this (or Eleven Labs) to generate a voice
| model to plug into an Android phone's TTS?
|
| I have a friend with a paralysed larynx who is often using his
| phone or a small laptop to type in order to communicate. I know
| he would love it if it was possible to take old recordings of him
| speaking and use that to give him back "his" voice, at least in
| some small measure.
| Share6323 wrote:
| That would be awesome
___________________________________________________________________
(page generated 2024-01-01 23:00 UTC)