Post AJLbKfGPD2Sx6LgWMi by blackfire@fosstodon.org
 (DIR) Post #AJLZylf0e5D2cFBiUq by mike@fosstodon.org
       2022-05-11T15:44:43Z
       
       0 likes, 0 repeats
       
       #Mimic3 is finally available for beta testing. It's not a "fully open" test and they send you a link with a generic user/pass combo after you register your email. I assume this is to limit bot traffic and gauge interest. Still looking forward to the #Mycroft #Mark2 in September, and more than likely I'll be using the new @popey voice. I'm still using the original AP voice on my Mark1, and it wouldn't feel natural to use anything else. https://mycroft.ai/blog/mimic-3-preview/
       
 (DIR) Post #AJLa6ja8G2pmUAhnTk by doomsdayrs@fosstodon.org
       2022-05-11T15:46:10Z
       
       0 likes, 0 repeats
       
       @mike @popey I just got the email. Did you time this to overload the site? ;p
       
 (DIR) Post #AJLanGNuggv4NuGm2a by mike@fosstodon.org
       2022-05-11T15:53:52Z
       
       0 likes, 0 repeats
       
       @doomsdayrs I would love it if the site were overloaded! Not because I want the site to fail, but I'd really like to see some solid interest in VUI products by the open source community. Google and Amazon are running rampant all over this market segment and letting a private company basically wire your house for monitoring seems like a really horrible idea to me. We desperately need a viable open source option here.
       
 (DIR) Post #AJLb54LcBD3gGTgWjQ by popey@mastodon.social
       2022-05-11T15:49:48Z
       
       0 likes, 0 repeats
       
       @doomsdayrs @mike I too got the mail! I don't have much time to play with it. But I'd like to see what others do with it. Maybe the old blame popey site could be revived :)
       
 (DIR) Post #AJLb54eP3M5TCkTX16 by mike@fosstodon.org
       2022-05-11T15:57:04Z
       
       0 likes, 0 repeats
       
       @popey So far, #Mimic3 seems really good. High quality TTS. When the AP voice tells me "The weather in Phoenix is ^$@%^@ing hot", I almost believe you're the one saying it. Some of the US English voices have an almost Indian accent to them, which is probably something that needs fixing, but overall the 15-20 minutes I've spent with it have been really impressive. Can't install it locally yet, but the fact that you can have a single Mimic3 instance for all your speakers is cool. @doomsdayrs
       
 (DIR) Post #AJLbKfGPD2Sx6LgWMi by blackfire@fosstodon.org
       2022-05-11T15:49:34Z
       
       0 likes, 0 repeats
       
       @doomsdayrs @mike @popey If they actually get this working properly offline, this is going to make a lot of people happy.
       
 (DIR) Post #AJLbKftkqjwR4O5whM by mike@fosstodon.org
       2022-05-11T15:59:51Z
       
       0 likes, 0 repeats
       
       @blackfire For sure. The biggest obstacle to fully offline Mycroft is just the fact that getting it set up requires several PhDs and a week of time. You can run the backend offline with Selene and configure your speakers to use it, but it's not something that they advertise and it's not a checkbox-type config. You really have to dig into the software to get that working. I'd love to see it made easy enough for a person off the street to do it. @doomsdayrs @popey
       
 (DIR) Post #AJLjeu1WqBZ6Itt5MW by shom@fosstodon.org
       2022-05-11T17:33:12Z
       
       0 likes, 0 repeats
       
       @mike thanks for sharing. I listened across three languages and it's quite natural sounding. Some voices are better than others (provided feedback). Did anyone play with the prosody tag? It inserted a pause only in one language (I reported that too) but overall very impressive. @popey
       
 (DIR) Post #AJLkL4JIz8ygN0JBz6 by mike@fosstodon.org
       2022-05-11T17:40:48Z
       
       0 likes, 0 repeats
       
       @shom I wasn't super interested in the prosody tag, but I did toy with break times and changing the voice midstream. That did seem to work for everything that I tried, but I only had an American voice speaking a partial phrase and getting interrupted by @popey during a predefined pause in the text. Seemed to work perfectly, but everything was English because that's the only language I can claim to speak. Even that I'm not sure I can claim fluency in.
       
 (DIR) Post #AJLl9yXcC7fFFJHBS4 by mike@fosstodon.org
       2022-05-11T17:50:01Z
       
       0 likes, 0 repeats
       
       @shom I have noticed that adding prosody volume changes adds pauses to the output too. I assume that's not intentional? I put this in as the input:
       <prosody volume="50%">Hola, mi nombre es Miguel. <break time="500ms" />Hablo mal español.</prosody><voice name="en_UK/apope_low#default"><prosody volume="50%">Thank</prosody>god<prosody volume="50%">for that</prosody></voice>
       The results are not good. The Spanish is trashed, and the English pauses for volume changes. @popey
       
 (DIR) Post #AJLlGVwQvTA0ZZ0Drc by mike@fosstodon.org
       2022-05-11T17:51:12Z
       
       0 likes, 0 repeats
       
       @shom Oh, I was using es_ES/m-ailabs_low#karen_savage for the voice config. @popey
       
 (DIR) Post #AJLoI4kTIcLAg1qNEG by shom@fosstodon.org
       2022-05-11T18:25:06Z
       
       0 likes, 0 repeats
       
       @mike interesting! I didn't try volume, only rate. It sounded mostly fine (any slowed/sped-up speech sounds a bit weird) but the pause was jarring. @popey
       
 (DIR) Post #AJLpIQX0PYg8BnHCyW by blackfire@fosstodon.org
       2022-05-11T18:36:22Z
       
       0 likes, 0 repeats
       
       @mike I didn't even know you could get the backend running locally. I might have to give it a go. Have you seen any below PhD level guides in the wild? @doomsdayrs @popey
       
 (DIR) Post #AJLpuhp8ZCB2pD0hBw by mike@fosstodon.org
       2022-05-11T18:43:16Z
       
       0 likes, 0 repeats
       
       @blackfire Nothing 3rd party, and I'm being more than a little hyperbolic, but there are instructions on the GitHub page: https://github.com/MycroftAI/selene-backend Theoretically with the #Selene backend on prem, any #Mycroft instances pointed to that backend, and a local #Mimic3 install, you can run the whole stack privately and offline. I've never done the whole stack, and Mimic3 is still so new that you have to request the files manually, but I'd love to see this working. @doomsdayrs @popey
       
 (DIR) Post #AJLskYHRUgb2C26Et6 by mike@fosstodon.org
       2022-05-11T19:15:04Z
       
       0 likes, 0 repeats
       
       @shom I got this from Mike Hansen over at #Mycroft (paraphrased). Good information to have moving forward IMHO. MH: Thanks for the feedback. This is indeed an issue with the current SSML processing pipeline -- prosody tags are best used for full sentences at the moment. The pauses are due to a few different features colliding: automatic punctuation is added when the prosody tags divide up the sentence, so Mimic 3 sees each text chunk marked as a complete utterance. @popey
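
A rough sketch of what "prosody tags for full sentences" could look like in practice, using only Python's standard library to build the markup. The helper name `sentence_level_ssml`, the `<speak>` root wrapper, and the rule that each chunk must end in sentence punctuation are assumptions made for illustration; they are not taken from the Mimic 3 docs, and the output has not been run against the beta. The voice names are the ones mentioned earlier in the thread.

```python
import xml.etree.ElementTree as ET

SENTENCE_END = (".", "!", "?")

def sentence_level_ssml(sentences):
    """Build SSML in which every <prosody> tag wraps one whole sentence.

    `sentences` is a list of (voice_name, volume, text) tuples.
    Hypothetical helper: it refuses text that does not end in sentence
    punctuation, so a prosody tag never splits a sentence mid-phrase.
    """
    speak = ET.Element("speak")  # assumed root element
    for voice_name, volume, text in sentences:
        if not text.rstrip().endswith(SENTENCE_END):
            raise ValueError(f"not a complete sentence: {text!r}")
        voice = ET.SubElement(speak, "voice", name=voice_name)
        prosody = ET.SubElement(voice, "prosody", volume=volume)
        prosody.text = text
    return ET.tostring(speak, encoding="unicode")

ssml = sentence_level_ssml([
    ("es_ES/m-ailabs_low#karen_savage", "50%", "Hola, mi nombre es Miguel."),
    ("en_UK/apope_low#default", "50%", "Thank god for that."),
])
print(ssml)
```

Compared with the input posted above, the volume change is applied to "Thank god for that." as a whole instead of to "Thank" and "for that" separately, which should avoid the chunk-per-utterance pauses Mike Hansen describes, assuming the pipeline behaves as he explains.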
       
 (DIR) Post #AJLui27cHs7RmhQkHA by shom@fosstodon.org
       2022-05-11T19:37:01Z
       
       0 likes, 0 repeats
       
       @mike thanks for explaining that. It's a beta and known issues are expected. It'd be nice to put that in the info panel where the tags are described. @popey