[HN Gopher] Show HN: Clone your voice and speak a foreign language
___________________________________________________________________
Show HN: Clone your voice and speak a foreign language
Author : _josh_meyer_
Score : 115 points
Date : 2022-01-03 20:17 UTC (2 hours ago)
(HTM) web link (coqui.ai)
(TXT) w3m dump (coqui.ai)
| alonmln wrote:
| Cool, it's impressive how much it can do with a short sample,
| although this seems like an easy way for end users to deep fake
| their friends / enemies saying something.
| tiborsaas wrote:
| I tested it with your comment: https://sndup.net/mghy/ :)
|
| It's also a new way to somewhat personalize text-to-speech
| engines. The above example is not really close to my
| voice.
| Philip-J-Fry wrote:
| Maybe the solution is to have a randomly generated paragraph of
| text to read which expires in a short amount of time. So you
| can't predict it and you don't have enough time to splice
| together a fake reading from something else.
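|
| A minimal sketch of the expiring random-prompt idea above, not
| anything the demo is known to implement. The word list, the
| `Challenge` type, and both function names are hypothetical, and
| the transcript is assumed to come from a separate speech
| recognizer that is outside this sketch.

```python
import secrets
import time
from dataclasses import dataclass

# Hypothetical vocabulary; a real system would use a much larger list.
WORDS = [
    "amber", "river", "candle", "orbit", "meadow", "copper", "violet",
    "harbor", "lantern", "pebble", "willow", "thunder", "saddle", "frost",
]


@dataclass(frozen=True)
class Challenge:
    text: str          # the paragraph the user must read aloud
    expires_at: float  # UNIX timestamp after which the recording is rejected


def issue_challenge(n_words=12, ttl_seconds=60.0, now=None):
    """Pick unpredictable words and attach a short expiry time."""
    now = time.time() if now is None else now
    text = " ".join(secrets.choice(WORDS) for _ in range(n_words))
    return Challenge(text=text, expires_at=now + ttl_seconds)


def is_acceptable(challenge, transcript, now=None):
    """Accept only an in-time recording whose transcript matches the prompt."""
    now = time.time() if now is None else now
    if now > challenge.expires_at:
        return False
    return transcript.strip().lower().split() == challenge.text.split()
```

| The prompt is drawn with `secrets` rather than `random` so it
| cannot be predicted, and the short time-to-live limits how long
| someone has to splice a matching recording together.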
| kdavis wrote:
| Currently we're looking at possible solutions, see for example
| here[1]. If you have suggestions, feel free to chime in!
|
| In the demo we specifically disallowed bulk uploads to hinder
| such abuses.
|
| [1] https://github.com/coqui-ai/TTS/discussions/1036
| acqbu wrote:
| Gold!
| jeroenhd wrote:
| Interesting. I like the addition of music to make sure it's not
| just a raw voice sample. The output I get seems to be a mix of a
| native speaker and my voice, because my (thick) accent is being
| filtered out.
|
| I suppose that if I ever take proper English pronunciation
| classes, I now know what to strive for.
| wombatmobile wrote:
| Awesome!
|
| How do I embed this?
| bagels wrote:
| Is there a static demo that I don't have to provide my own voice
| for?
| kdavis wrote:
| We did not provide such a demo in part to hinder nefarious uses
| of the technology.
| crumpled wrote:
| Honestly, how much of a hindrance is that? A person could
| just supply a recording of another person, couldn't they?
| reubenmorais wrote:
| The project page has a bunch of pre-rendered samples and ground
| truths: https://edresson.github.io/YourTTS/
| pcarolan wrote:
| This is incredibly impressive and does a great job of capturing
| my voice. Well done!
| akeck wrote:
| Is it supposed to translate or just read with the target accent?
| For me, it's only reading the English input text with the target
| accent.
| reubenmorais wrote:
| It doesn't translate the text; you have to put in text in the
| target language. But you can record audio speaking in any
| language you want.
| sxv wrote:
| My 26 second training input perhaps wasn't enough. The result
| sounded like someone else. Is the result some kind of merger of
| my voice and a native speaker's?
| reubenmorais wrote:
| Similarity depends on many factors: recording quality, which
| language you're synthesizing in (models trained on more
| speakers do better), and diversity of prosody in your
| recording. Try recording for a bit longer and "acting out" a
| bit in your tone; that tends to give me interesting results :)
| IanCal wrote:
| Very interesting! Is the music an intentional blended track or an
| artifact of generation?
| _josh_meyer_ wrote:
| very much intentional.
|
| Background music makes misuse/abuse less likely (both
| intentional and unintentional)
|
| Read more about it in our open discussion:
| https://github.com/coqui-ai/TTS/discussions/1036
| momolo wrote:
| is the model available?
| _josh_meyer_ wrote:
| Demo: https://coqui.ai
| Code: https://github.com/coqui-ai/tts
| Blogpost: https://coqui.ai/blog/tts/yourtts-zero-shot-text-synthesis-l...
| Paper: https://arxiv.org/abs/2112.02418
| echelon wrote:
| This is so cool! Thank you!
|
| How do y'all intend to profit (succeed as a startup) if
| you're releasing so much publicly? I'd love to see you guys
| succeed.
|
| Really great to see where some of the Mozilla TTS folks wound
| up, too.
| SwiftyBug wrote:
| I speak Brazilian Portuguese natively. I chose to record my voice
| saying a specific sentence and to "translate" it to Brazilian
| Portuguese using the exact same sentence. I was very pleased to
| find out that I became a Mineiro from the countryside, one of the
| coolest accents in Brazil!
| actually_a_dog wrote:
| You spoke Portuguese into it and it just changed your accent?
| That's kinda cool.
| reubenmorais wrote:
| The Brazilian Portuguese model is a bit of an extreme showcase
| (and thus really cool!), as it was trained on a single speaker
| (entirely recorded by the main author of the paper, Edresson
| Casanova, who's Brazilian).
|
| The fact that it can do multi-lingual voice cloning at all in
| that case is already surprising. You can find more details in
| the project page [0] and paper [1]. And here's the corpus. [2]
|
| [0] https://edresson.github.io/YourTTS/
|
| [1] https://arxiv.org/abs/2112.02418
|
| [2] https://edresson.github.io/TTS-Portuguese-Corpus/
| winter_squirrel wrote:
| ceva wrote:
| it says enter your text here ..
| kdavis wrote:
| You're free to enter any input sentence you want in the text
| box.
|
| The input sentence generally should be in the language you
| selected from the dropdown. For example, if the dropdown has
| "French" selected you could enter the text "Allons enfants de
| la Patrie, Le jour de gloire est arrivé!"
|
| Clicking "Submit" then generates a TTS reading of the sentence
| you input in the language selected from the dropdown.
|
| For fun you can mix and match. In other words, select a
| language from the dropdown and enter text in the text box
| _not_ in the language selected from the dropdown. (For example,
| the dropdown could have "French" selected and the sentence
| could be "O say can you see, by the dawn's early light". This
| gives interesting results: it sounds as if a native French
| speaker is speaking English.)
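|
| A hedged sketch of the same flow in code rather than in the web
| demo. It assumes a later release of the coqui-ai/TTS package
| that ships the `TTS.api` module and the
| `tts_models/multilingual/multi-dataset/your_tts` model. The
| dropdown labels, the `LANGUAGE_CODES` mapping, and both helper
| functions are illustrative assumptions, not the demo's actual
| code.

```python
# Assumed mapping from dropdown labels to the model's language codes.
LANGUAGE_CODES = {
    "English": "en",
    "French": "fr-fr",
    "Brazilian Portuguese": "pt-br",
}


def build_request(dropdown_language, text, speaker_wav):
    """Pair the text with the selected language; the text is not translated."""
    if dropdown_language not in LANGUAGE_CODES:
        raise ValueError(f"unsupported language: {dropdown_language}")
    return {
        "text": text,
        "language": LANGUAGE_CODES[dropdown_language],
        "speaker_wav": speaker_wav,  # reference recording, in any language
    }


def synthesize(request, out_path="output.wav"):
    """Render the request with YourTTS; needs the TTS package installed."""
    from TTS.api import TTS  # imported lazily so the helper above has no deps

    tts = TTS(model_name="tts_models/multilingual/multi-dataset/your_tts")
    tts.tts_to_file(
        text=request["text"],
        speaker_wav=request["speaker_wav"],
        language=request["language"],
        file_path=out_path,
    )
    return out_path
```

| Passing English text with "French" selected, as in
| `build_request("French", "O say can you see", "me.wav")`,
| reproduces the mix-and-match case: the text is read as written,
| with the pronunciation of the selected language.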
___________________________________________________________________
(page generated 2022-01-03 23:00 UTC)