Post AbqffJAkq0dLAi2Fu4 by ericflo@mastodon.xyz
(DIR) Post #AbqdOnPOy1pe7HswCm by simon@fedi.simonwillison.net
2023-11-16T00:55:13Z
0 likes, 0 repeats
Some notes on cloning my voice using ElevenLabs, which turned out to be shockingly quick and easy: a single 10 minute audio sample was all it needed https://til.simonwillison.net/misc/voice-cloning
(DIR) Post #Abqdmm25jbjenbPooq by webology@mastodon.social
2023-11-16T00:59:20Z
0 likes, 0 repeats
@simon I don't know that I can tell the difference between your normal voice and your cloned voice. Can you tell the difference?
(DIR) Post #AbqeAh6zgLpUUG7Aqu by simon@fedi.simonwillison.net
2023-11-16T01:03:53Z
0 likes, 0 repeats
@webology I can, yes - as can Natalie - but we are the world's foremost experts on that specific voice!
(DIR) Post #AbqemKs8oglqW57Gk4 by webology@mastodon.social
2023-11-16T01:10:49Z
0 likes, 0 repeats
@simon Since most people think hearing their own voice sounds weird, does the clone have a similar effect? I can see people finding this more or less strange depending on that sensation. 🤔
(DIR) Post #AbqffJAkq0dLAi2Fu4 by ericflo@mastodon.xyz
2023-11-16T01:20:33Z
0 likes, 0 repeats
@simon ElevenLabs reeeeeally benefits from how nervous all the big companies are to release actually good open models for TTS. It's frustrating to me that the technology has existed for years in labs (VALL-E etc) but nobody wants to open source it.
(DIR) Post #Abqhq2Z182FQkqd4uO by pieq@floss.social
2023-11-16T01:44:49Z
0 likes, 0 repeats
@simon This is really impressive. I am not a native English speaker, nor am I an expert on your voice, and I really cannot tell that the English spoken article is generated. What I find amusing is that the Spanish translation really does sound like an Englishman speaking Spanish :) I guess this might be of some interest to linguists to better understand the mechanisms of language learning and things like that. Thanks for sharing!
(DIR) Post #Abqka0AwXYUCZoPeQS by LauraLangdon@hachyderm.io
2023-11-16T02:15:18Z
0 likes, 0 repeats
@simon Interesting! To my ear the difference in timbre is an instant and quite distinct difference, like even if I didn’t speak the language and couldn’t pick out that the accent is different, the timbre is a dead giveaway.
(DIR) Post #AbqyDKcNlfXWfyKbWi by iw@hachyderm.io
2023-11-16T04:48:07Z
0 likes, 0 repeats
@simon Neat! Are you aware of any open source tools to generate my own synthesized voice on my own machine? I'm a bit skeptical about sharing my voice with a vendor.
(DIR) Post #Abr4SwVvkllyHE8mwq by simon@fedi.simonwillison.net
2023-11-16T05:58:42Z
0 likes, 0 repeats
@iw I've not tried it but apparently there are a bunch these days
(DIR) Post #AbrDJK7O2QLSAw9H4i by raul@ruby.social
2023-11-16T07:37:19Z
0 likes, 0 repeats
@simon If you're curious about it, the Spanish translation has a thick English (not sure if British?) accent but is totally understandable, as if I was talking to a tourist who's fluent in Spanish. Probably the biggest mistake is in "envoltorio" (it puts the accent in the "i", as in "envoltorío", which isn't correct). It's probably been trained with Latin American voices (I'm from Spain, so I can't tell if there's a specific country's accent). Very, very impressive.
(DIR) Post #AbrWPhQUBs2PKYpt7g by akuchling@dmv.community
2023-11-16T11:11:29Z
0 likes, 0 repeats
@simon Goodbye to dubbing voiceover artists, eventually. With this you could have Tom Cruise's dubbed voice speak Spanish or French or German with a fairly natural accent.
(DIR) Post #Abs2V0kWyHl7F3mwKm by iw@hachyderm.io
2023-11-16T17:10:59Z
0 likes, 0 repeats
@simon thx! I found one on hugging face and its github repo available on colab: https://github.com/KevinWang676/Bark-Voice-Cloning
(DIR) Post #AbvM0FeU5yY32dRxvk by moof@cupoftea.social
2023-11-18T07:33:24Z
0 likes, 0 repeats
@simon I’m particularly intrigued about the Spanish (I’m a native Spanish speaker). “Your” accent is noticeably anglosphere, with some mistakes only English speakers make (like aspirating the h in “Ahora”), but also definitely sounds like you learned Mexican Spanish, due to the intonation, lilt, and some of the phonemes, rather than Spain’s Spanish which is what you would have done at GCSE level if you had tried that. It also devolves into uncanny valley: that last paragraph is almost devoid of tone, and whilst the first couple of paragraphs sound like your normal bouncy presentation style, the last paragraph is downright creepy but not quite verging into robotic. It breaks the illusion that the reader understands what they are saying. It’s not there…yet. But it’s coming. And it’s not far off at all. Are you going to use this to add an audio option for your blog, get your posts read out for you?
(DIR) Post #AbziEQFEU3llZXdmy0 by brucelawson@social.vivaldi.net
2023-11-20T10:01:34Z
0 likes, 0 repeats
@simon I find the generated Simon ("51m0n"?) indistinguishable from your training sample.