[HN Gopher] Deep-learning text-to-speech tool for generating voi...
       ___________________________________________________________________
        
       Deep-learning text-to-speech tool for generating voices of various
       characters
        
       Author : clxxx
       Score  : 263 points
       Date   : 2021-01-06 02:36 UTC (20 hours ago)
        
 (HTM) web link (15.ai)
 (TXT) w3m dump (15.ai)
        
       | nmfisher wrote:
       | From the about section:
       | 
       | > How much does maintaining the servers cost?
       | 
       | > It depends on the
       | amount of traffic, but the minimum baseline is around several
       | thousands of US dollars every month. This is expected as
       | inference is very GPU intensive and a sufficient number of
       | instances need to be spun up to handle thousands of requests
       | coming in every minute. Everything is paid out of pocket.
       | 
       | Wow, impressive commitment for something that's free.
        
         | mickof wrote:
         | You just sort of assume that this is correct? The person[1]
         | running this comes across as a severely unstable character,
         | that number is probably hyperbole.
         | 
         | [1] https://twitter.com/fifteenai
        
           | nmfisher wrote:
           | I've worked with deep learning models enough to know the cost
           | of running GPU inference, and if the live queue stats
           | published on the website are accurate, then thousands of
           | dollars per month is certainly plausible.
           | 
           | I have no reason to disbelieve it.
        
           | 15ai wrote:
           | Not hyperbole - I can provide proof if you'd like.
        
             | nmfisher wrote:
             | Separate question - is this English only? It looks like you
             | can feed in phonemes but I assume this has been trained
             | with English audio.
        
           | hooloovoo_zoo wrote:
           | It seems like one could get to those numbers pretty easily
           | given the prices for GPU instances on AWS. Even just one
           | decent-sized instance would be thousands of dollars per
           | month.
        
         | vsupalov wrote:
         | Yeah, running anything related to AI involves GPU instances. An
         | alternative is to point people to using Google Colab where you
         | can get access to a GPU for free, but that's not a smooth end
         | user experience for most folks.
        
           | aisofteng wrote:
           | > running anything related to AI involves GPU instances
           | 
           | This is not true. A _lot_ of AI applications use algorithms
           | such as logistic regression or random forests and don't need
           | GPUs - partly, of course, because GPUs are so expensive and
           | these approaches are good enough (or more than good enough)
           | for many applications.
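
To illustrate the point about CPU-only methods, here is a minimal sketch of logistic regression trained with plain gradient descent, using only the Python standard library. The toy data, learning rate, and step count are assumptions picked for illustration; nothing here comes from 15.ai or any project mentioned in this thread.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(xs, ys, lr=0.1, steps=20000):
    """Fit p(y=1|x) = sigmoid(w*x + b) by batch gradient descent on CPU."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # gradient of log-loss w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return the predicted class label (0 or 1)."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Hypothetical one-feature data set: small values are class 0, large are class 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 0, 1, 1, 1]

w, b = train_logistic_regression(xs, ys)
print([predict(w, b, x) for x in xs])
```

The whole training loop is a few hundred thousand floating-point operations, which is why models of this kind are usually served from ordinary CPU instances.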
        
             | vsupalov wrote:
             | Whoops, sloppy generalization on my part. You're completely
             | right of course, thanks! I've been focusing on deep
             | learning a lot lately, to the point where AI has become an
             | alias for those exciting new GPU-heavy techniques.
        
         | calebkaiser wrote:
         | The price of GPU inference can be brutal, but there's a lot you
         | can do on the infra side to improve it:
         | 
         | - Spot instances
         | 
         | - Aggressive autoscaling
         | 
         | - Micro batching
         | 
         | Can reduce inference compute spend by huge amounts (90% is not
         | uncommon). ML, especially anything involving realtime
         | inference, is an area where effective platform engineering
         | makes a ridiculous difference even in the earliest days.
         | 
         | Source: I help maintain open source ML infra for GPU inference
         | and think about compute spend way too much
         | https://github.com/cortexlabs/cortex
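
As a rough sketch of the micro batching idea in the list above: requests that arrive close together are grouped so that one GPU forward pass serves several of them. The function name, the `max_batch_size` and `max_wait` parameters, and the sample timestamps are all hypothetical; this is not Cortex's API, only a simulation of the grouping policy.

```python
def micro_batches(requests, max_batch_size=4, max_wait=0.05):
    """Group (arrival_time_seconds, payload) pairs into batches.

    A batch is closed when it is full, or when the next request arrives
    more than max_wait seconds after the first request in the batch.
    """
    batches = []
    current = []
    window_start = None
    for arrival, payload in sorted(requests, key=lambda r: r[0]):
        full = len(current) >= max_batch_size
        expired = bool(current) and arrival - window_start > max_wait
        if full or expired:
            batches.append(current)
            current = []
        if not current:
            window_start = arrival  # first request opens a new window
        current.append(payload)
    if current:
        batches.append(current)
    return batches


# Hypothetical arrivals: three requests in a burst, then two more later.
incoming = [(0.00, "a"), (0.01, "b"), (0.02, "c"), (0.20, "d"), (0.21, "e")]
print(micro_batches(incoming, max_batch_size=2, max_wait=0.05))
```

The trade-off is latency against throughput: a longer `max_wait` fills batches more often but makes the first request in each batch wait longer.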
        
         | Nican wrote:
         | Out of curiosity, as I have no visibility about the infra
         | actually required- but at that cost, would it not be easier to
         | just have a machine under a desk somewhere?
        
           | calebkaiser wrote:
           | Not for the kind of inference running here, I'd imagine.
           | 
           | There are a few key reasons why most realtime inference is done
           | on the cloud:
           | 
           | - Scale. Deep learning models especially tend to have poor
           | latency, especially as they grow in size. As a result, you
           | need to scale up replicas to meet demand at a way lower level
           | of traffic than you do for a normal web app. At one point, AI
           | Dungeon needed over 700 servers to support just thousands of
           | concurrent players.
           | 
           | - Cost. Related to the above, GPUs are really expensive to
           | buy. A g4dn.xlarge instance (the most popular AWS EC2
           | instance for GPU inference) is $0.526/hour on demand. To hit
           | $3,000 per month in spend, you'd need to be running ~8 of
           | them 24/7. Prices vary with purchasing GPUs, but you could
           | expect 8 NVIDIA T4's to run around $20,000 at minimum, plus
           | the cost of other components and maintenance. To be clear,
           | that's very conservative--it's unlikely you'll get consistent
           | traffic. What's more likely is you'll have some periods of
           | very little traffic where you need one or two GPUs, and other
           | high load periods where you'll need 10+.
           | 
           | - Less universal of an issue, but the cloud gives you much
           | better access to chips at lower switching costs. If NVIDIA
           | releases a new GPU that's even better for inference,
           | switching to it (once it's available on your cloud) will be a
           | tweak in your YAML. If you ever switch to ASICs like AWS's
           | Inferentia or GCP's TPUs, which in many cases give way better
           | performance and economics than GPUs, you'll also naturally
           | have to be on their cloud.
           | 
           | However, there is a lot that can be done to lower the cost of
           | inference even in the cloud. I listed some things in a
           | comment higher up, but basically, there are some assumptions
           | you can make with inference that allow you to optimize pretty
           | hard on instance price and autoscaling behavior.
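
The cost arithmetic in the comment above can be checked with a short sketch. The hourly rate, instance count, and $20,000 hardware figure are the ones quoted in the comment; the 30-day month and the function names are assumptions, and the break-even figure ignores power, hosting, the other components, and the fact that real traffic is not constant.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month


def monthly_on_demand_cost(hourly_rate, instances, hours=HOURS_PER_MONTH):
    """Cost of running `instances` machines around the clock for one month."""
    return hourly_rate * instances * hours


def breakeven_months(hardware_cost, monthly_cloud_cost):
    """Months of cloud spend that would equal an up-front hardware purchase."""
    return hardware_cost / monthly_cloud_cost


G4DN_XLARGE_HOURLY = 0.526  # on-demand price quoted in the comment
T4_HARDWARE_COST = 20_000   # quoted lower bound for 8 NVIDIA T4s

cloud = monthly_on_demand_cost(G4DN_XLARGE_HOURLY, instances=8)
print(f"8 instances, 24/7: ${cloud:.2f} per month")
print(f"Break-even vs. buying GPUs: {breakeven_months(T4_HARDWARE_COST, cloud):.1f} months")
```

Under these assumptions, eight instances come to roughly $3,030 a month, which matches the "~8 of them" estimate, and the GPUs alone would take more than six months of steady cloud spend to pay for.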
        
       | code51 wrote:
       | I fear this will end up as a massive debt on their part.
        
       | Meph504 wrote:
       | seriously fuck anyone that is putting in forced time delays on
       | their terms, how about you let me read what it is you are doing
       | before requiring shit like this.
        
         | duckmysick wrote:
         | If you don't agree with the terms, including how they are
         | presented to you, you can always reject them and leave the
         | site.
        
       | atum47 wrote:
       | While you're typing a word, the text box doesn't show it; once you
       | complete the word, it shows up in the text box. Brave, Android.
       | 
       | Besides that, amazing results. Congratulations.
        
       | bravura wrote:
       | 15ai, do you mind talking a bit about the methods you are using?
        
       | uberman wrote:
       | This was amazing!
        
       | mvts wrote:
       | Nice work on the Gordon Freeman Voice :D
        
       | danShumway wrote:
       | I don't usually expect much from demos like this, but I'm kind of
       | surprised how impressive the results currently are. They're
       | definitely not perfect, you're definitely getting some odd
       | clipping and noise, but this shows a large amount of promise.
       | 
       | Being able to generate voices for games would enable a lot of
       | interesting indie projects. IMO people should be paying more
       | attention to the market implications of products like this than to
       | the social implications. There are a lot of projects that just
       | aren't really feasible right now that could be if this kind of
       | technology was more polished and generally available for
       | commercial/self-hosted use. And in those cases, you don't even
       | need to do inference, makers will likely be willing to mark up
       | their scripts themselves.
       | 
       | Anyway I digress. Congrats, this is really cool!
        
         | Pfhreak wrote:
         | > people should be paying more attention to the market
         | implications of products like this than to the social
         | implications.
         | 
         | People will absolutely suffer harm from this tech, but hey,
         | think about the dollars that could be made! No, we should
         | absolutely be paying more attention to the social implications.
        
           | C19is20 wrote:
           | Musicians Union?
        
           | danShumway wrote:
           | Eh, this technology currently falls very squarely into the
           | category of "almost good enough that I could use it for a
           | creative project, but not _nearly_ good enough that you're
           | going to be able to convince me that the results aren't
           | generated."
           | 
           | I'm not primarily interested in the dollars, I'm
           | interested in allowing communities to do creative things. I
           | think people are looking at this tech like it's only going to
           | be used for deepfakes, and they're underestimating the extent
           | it's going to be used to create voice-acted game mods,
           | animations, anonymization tools, and other creative/helpful
           | projects.
           | 
           | If you're really worried about this stuff though, you can
           | take some comfort in the fact that by far the worst examples
           | on the site are of real-world voices. This is currently
           | technology that as far as I can see is far more suited for
           | generating new voices or voicing cartoon characters with
           | well-defined patterns/inflections than it is for imitating
           | the president.
        
             | Pfhreak wrote:
             | You are looking at the current implementation and not
             | thinking about the implication.
             | 
             | One, this tech absolutely could be used to fool someone.
             | Not everyone will be listening with a critical ear. Played
             | back over a phone or injecting a phrase or two in otherwise
             | spoken samples will fool many people.
             | 
             | I guarantee you someone will be using this to make their
             | own MLP episodes on YouTube specifically designed to scare
             | children or get them to do awful things.
             | 
             | Models presumably get better over time. It really won't be
             | too much longer until people will be able to fake
             | celebrities, politicians, exes, authority figures, etc. As
             | a fairly benign example, if I had this in high school you
             | better believe I could have called to excuse some of my
             | absences.
             | 
             | I agree, I love the idea of generating some decent voice
             | lines for my own games projects, but this also introduces
             | issues of the rights of the original voice actors.
             | 
             | If you train a model to mimic a performance given by an
             | actor, then use that model and fire the actor, isn't that
             | potentially really problematic? (Also, it draws parallels
             | to the Luddites who were not anti technology, but wanted to
             | ensure that technology wasn't used in a way that reduced
             | worker quality of life.)
             | 
             | And yes, I think there are helpful ways this could be
             | deployed. I'm gender fluid, and I'd love to be able to
             | adjust my voice digitally, but we need to be thinking about
             | how this could cause harm first.
        
               | visarga wrote:
               | I am thinking it could be used to impersonate someone in
               | a phone call to a family member for conning.
        
               | danShumway wrote:
               | > One, this tech absolutely could be used to fool
               | someone.
               | 
               | The problem I have here is that it's already not hard to
               | fool people. I don't think it's feasible for us to say
               | that we're going to put something that could be highly
               | beneficial on hold just because we don't want to deal
               | with social education efforts that we kind of already
               | need to tackle anyway. Per your example, if we get rid of
               | deepfakes, it's not clear to me that Youtube is going to
               | be any more safe. I already would not allow a child to
               | browse Youtube unattended, people already generate the
               | videos you're talking about.
               | 
               | And I know that people are putting this in a different
               | category than general CGI, voice modulation, or consumer-
               | grade apps like Photoshop. I'm not going to argue that
               | it's necessarily wrong for people to be worried, but no
               | matter how many times people tell me that this is
               | fundamentally different, I still have not seen any
               | serious evidence that this technology is going to be more
               | dangerous than Photoshop, and I think it's going to be
               | way easier to detect than a decent Photoshop job is.
               | Photoshop's content-aware paste/fill tools are better
               | than this example, and they arguably require less work to
               | use.
               | 
               | And again... I'm sympathetic to concerns about moving too
               | fast, but I just don't think there's any world, even if
               | you could get rid of deepfakes entirely, where we don't
               | need to be worried about media literacy and general
               | skepticism. If people today don't realize that voices can
               | already be convincingly faked, then that's a really
               | serious problem, and if democratizing that ability causes
               | society in general to become more aware of the potential
               | of disinformation, then honestly that might even be a
               | good thing that we should be encouraging.
               | 
               | So sure, concerns, but in my mind people are focusing on
               | one particular implication that I don't think is
               | particularly likely, and ignoring that responding to that
               | concern is probably going to look the same no matter what
               | our position on deepfakes is.
               | 
               | > If you train a model to mimic a performance given by an
               | actor, then use that model and fire the actor, isn't that
               | potentially really problematic?
               | 
               | I think that's a very complicated question. I would not
               | assume that the loss of work for voice actors, who can
               | shift into voice generation roles, is going to be a big
               | enough downside that it overrules the upside of allowing
               | ordinary people to start generating their own vtube
               | avatars or commenting on and building on top of existing
               | culture.
        
               | Ajedi32 wrote:
               | > If people today don't realize that voices can already
               | be convincingly faked, then that's a really serious
               | problem, and if democratizing that ability causes society
               | in general to become more aware of the potential of
               | disinformation, then honestly that might even be a good
               | thing that we should be encouraging.
               | 
               | I've wondered about that angle as well. You can't put the
               | genie back in the bottle, so maybe the best way to combat
               | the threat of deepfaked misinformation is actually to
               | take the opposite approach and make it as easy as
               | possible for normal people to generate their own
               | deepfakes; that way it becomes common knowledge that such
               | things are possible (similar to how photoshop is common
               | knowledge today).
        
               | Erlich_Bachman wrote:
               | > If you train a model to mimic a performance given by an
               | actor, then use that model and fire the actor, isn't that
               | potentially really problematic?
               | 
               | And if you have to keep getting a person paid for
               | something that a machine could do with (assuming, as per
               | your post) 100% equal performance, that is not
               | problematic? When the voice becomes as good as real
               | actors, then yes of course they should become out of a
               | job. Just like progress has been going on for thousands
               | of years.
        
             | bawolff wrote:
             | It really doesn't have to be perfect to trick someone.
             | You're expecting this site to be fake so you're listening
             | carefully. If you weren't expecting anything and you were
             | in the middle of a busy day at work, you are much much less
             | likely to notice any discrepancies.
             | 
             | We already have stories like
             | https://www.forbes.com/sites/jessedamiani/2019/09/03/a-voice...
             | 
             | That said, as far as harms go, I don't think this is all
             | that bad that it should preclude creative uses of this
             | technology.
        
             | significant5 wrote:
             | I might be misunderstanding you, but there are no real-
             | world voices on the site? All of them are of characters.
        
               | danShumway wrote:
               | I see a pretty linear drop in quality from Glados to
               | Spongebob to Twilight Sparkle to the narrator from
               | Stanley Parable to the 10th Doctor.
               | 
               | It seems to struggle more and more as the voices get less
               | cartoony/exaggerated.
        
               | significant5 wrote:
               | I'm not too sure about that. From my testing, Fluttershy,
               | Applejack, Twilight, Chrysalis, Rise, and Kyu (and a
               | bunch of other characters that I'm surely forgetting)
               | seem to perform phenomenally well. Especially Chrysalis,
               | her emotions are extremely believable, and
               | Fluttershy/Applejack/Rise/Kyu have almost zero noise for
               | every generation. This might be the most impressive site
               | I've ever seen.
               | 
               | Oh, I somehow forgot all of the TF2 characters. Some of
               | them do struggle (Medic the most, I think) but everyone
               | else seems incredibly good.
               | 
               | And the Daria characters, too. Honestly, the vast
               | majority of characters are already near-perfect.
        
               | danShumway wrote:
               | Hrm. Well, I can't really argue with that beyond that my
               | standards on perfect might be different.
               | 
               | I think some of the best voices they have are characters
               | like Twilight, she shows a ton of promise. But as it
               | stands right now, I would still at least hesitate to use
               | Twilight's voice in a project unless I didn't have other
               | options. Chrysalis's voice is good, but again, is an
               | exaggerated cartoon character with a large amount of
               | inflection. I would not use her voice in her current
               | state without a lot of post-processing. Someone like the
               | Spy I would consider to be unusable, it sounds to me like
               | the character needs to clear their throat or something,
               | it's got a lot of strange artifacts. I definitely would
               | consider the 10th Doctor unusable, even for just a hobby
               | project or a voice assistant.
               | 
               | But... I don't know, maybe this is subjective. I can't
               | just tell you that what you're hearing is wrong, if you
               | like the results then you like the results :)
               | 
               | And again, I don't want to detract from how impressive
               | they are. They are incredibly impressive, particularly
               | because of how characters like Chrysalis emote. Extremely
               | promising. But I still think there's a difference between
               | 'impressive' and 'believable deepfake'.
        
               | significant5 wrote:
               | Yeah, that's fair. I dunno, I can't really hear anything
               | wrong with Fluttershy or Applejack no matter how hard I
               | try, but your ears are probably much better than mine :p
               | 
               | I've been seeing quite a few skits being posted on /r/tf2
               | (https://www.reddit.com/r/tf2/comments/kr374q/honestly_idk_i_...)
               | and all of the voices sound pretty much perfect
               | to me. But as you said, it's subjective.
        
         | Ajedi32 wrote:
         | I wonder if there are any legal concerns with using the voices
         | of well known characters/actors like this in a commercial
         | context.
        
           | danShumway wrote:
           | I don't _think_ a voice can be copyrighted, but IANAL so you
           | shouldn't bank on that.
           | 
           | If a voice could be copyrighted, or if this was a trademark
           | issue or something, I strongly suspect that this site would
           | _not_ fall under fair use regardless of whether or not it was
           | commercial. But again, IANAL, so I don't feel confident
           | making any kind of strong claim about that either.
        
             | dragonwriter wrote:
             | > I don't think a voice can be copyrighted, but IANAL so
             | you shouldn't bank on that.
             | 
             | The audio content (which includes voices) of the source
             | work is copyrighted, and a mechanical transform of that
             | work (which deep learning to mimic the voices clearly is)
             | would seem to be a derivative in at least the literal
             | sense.
        
               | thrill wrote:
               | IANAL and I would say no. Anyone is free to imitate anyone
               | else. A machine doesn't make that different. It would be
               | a violation to claim you were someone else while doing
               | the imitation.
        
       | Baeocystin wrote:
       | The fact that you included Chell as a voice choice (and
       | 'generated' a null audio clip to boot) earns a chuckle. The
       | quality of the voices across the board earns wide eyes and an
       | eyebrow raise. Thanks for sharing this, it's remarkable work.
        
         | high_byte wrote:
         | _GLaDOS_ hahahaha this is just... perfect. _Stanley Parable
         | Narrator_ funny you should mention this.
        
       | demonictoaster wrote:
       | The security implications of this kind of tech are scary. Going
       | forward it will become really easy to reproduce the voice of
       | anyone! It seems not a lot of training data is required to
       | achieve reasonable results (e.g. SpongeBob is just 27min of
       | voice, Half Life Black Mesa Announcer is just 1.9min!!). This
       | stuff could be easily leveraged for scams and deep fakes (along
       | with deep learning models that could also tweak lip movements to
       | match the voice for example). Thankfully, there is also a very
       | active area of research that leverages similar tech to detect
       | deep fakes.
        
         | dschooh wrote:
         | These kinds of discussions are common with articles about deep
         | fake video and audio. While I do not disagree with your point,
         | here are two quick thoughts:
         | 
         | - We have had perfect image manipulation capabilities for quite
         | some time now. We have had written text manipulation
         | capabilities for hundreds of years.
         | 
         | - People will continue to believe what they believe, whether
         | there is deep fake video and audio or not.
        
           | demonictoaster wrote:
           | Agree with you. Hopefully people are more and more aware that
           | they cannot trust anything out there. We are soon reaching a
           | point where we can make anyone say anything we want,
           | including in audio and video format.
        
         | spyder wrote:
         | It's already happening:
         | 
         |  _A Voice Deepfake Was Used To Scam A CEO Out Of $243,000_ :
         | 
         | https://www.forbes.com/sites/jessedamiani/2019/09/03/a-voice...
        
       | vsupalov wrote:
       | The results are really impressive. At the moment I'm considering
       | spending a low 3-figure amount for a professionally spoken intro
       | for a new podcast. Some of the lines I generated are in my top 5
       | easily, human speakers don't have a lot of edge for short generic
       | blurbs of text anymore it seems.
        
       | SV_BubbleTime wrote:
       | Is the author being cute putting Chell from Portal and Freeman
       | from Half-life in there, and then there is no audio? It would be
       | a weird oversight if not intentional because the author is
       | clearly familiar with Valve games.
        
       | trowngon wrote:
       | Are there open source projects like this?
        
         | CookieAnon wrote:
         | I have CookieTTS where I research lots of experimental stuff.
         | (You can see my credits on the 'Thanks' section of 15.ai)
         | 
         | I can get about 90% of the quality of 15.ai currently. I think
         | I could surpass 15.ai but not without some help.
        
       | EugeneOZ wrote:
       | Please give me a hint how to control the speed - Portal:Wheatley
       | is too fast for me.
       | 
       | Amazing toy! Thanks for "download" link, I'm creating a
       | collection of GlaDOS phrases now.
        
       | mensetmanusman wrote:
       | As Alexa and Siri have improved over the last couple years and
       | gotten a more human voice, it has been interesting observing my
       | young children (1-4) interact with such devices.
       | 
       | There is definitely a sense of 'who is that' coming from their
       | little minds that they are sometimes quite perplexed about. 'It's
       | a computer' is starting to feel like a cop-out answer as these
       | things improve...
        
       | MartinoPalmitos wrote:
       | Half-Life's Gordon Freeman voice is really spot-on!
        
       | kebman wrote:
       | Pretty cool! I tried it with this small dialogue, and then edited
       | together two voices in Reaper from the downloads:
       | 
       | Bob: "Hello, John."
       | 
       | John: "Oh, hello there, Bob."
       | 
       | Bob: "Yes, hello. It's what I said. Why do you keep repeating
       | what I say, John?"
       | 
       | John: "I didn't repeat you! I merely said hello, you dimwit!"
       | 
       | Bob: "There you go, being condescending again. Fuck you!"
       | 
       | John: "What? You're the one who started it!"
       | 
       | Try it yourself, or write something different. Either way, good
       | fun!
        
       | twangist wrote:
       | I get nothing but "Error code 422: Server error", even on input
       | "Hello", in FF, Safari and Chrome.
        
         | durdn wrote:
         | You may need to choose a "Source" in the top left. I got the
         | same error before choosing a character.
        
       | centimeter wrote:
       | This is extremely impressive.
       | 
       | I wonder if this will lead to a resurgence of "moon man" style
       | videos with well-known characters rapping extremely offensive
       | lyrics.
        
         | [deleted]
        
       | SommaRaikkonen wrote:
       | Welp, after messing around with a few voices I was completely
       | impressed with Glados's. This is really cool because I have no
       | idea how the character's voice was synthesized, but apparently ML
       | can do it for me so props to that.
        
         | smrq wrote:
         | I'm pretty sure the real Glados voice effect is mostly pitch
         | correction and formant shifting. You can do it with Melodyne at
         | least (which, to be fair, is also computer magic-- just a
         | different kind than this one!)
         | 
         | I just found a video on YT with an example of recreating this
         | in Melodyne: https://youtu.be/1oQn66gvwKA
        
           | jsheard wrote:
           | If I remember correctly from the Portal developer commentary
           | they did use voice synthesis, but only as a precursor.
           | 
           | They used basic text-to-speech to read out the script then
           | had the voice actress imitate the weird intonation of the TTS
           | reading.
        
         | giantrobot wrote:
         | GLaDOS was voiced by a real person [0]. Her voice had some
         | effects added but mostly just her trying to sound like a
         | computer.
         | 
         | [0] http://ellenmclain.net/
        
         | aksss wrote:
         | My favorite is Carl Brutananadilewski, but I just ended up
         | making him say actual phrases from ATHF in the end. Was hoping
         | to see Meatwad as a character option.
        
       | pure-struggle wrote:
       | will this be open source eventually?
        
         | pure-struggle wrote:
         | https://twitter.com/fifteenai/status/1342304487474606081
         | 
         | found an answer.
         | 
         | "There's no point in releasing a poorly done model, and to do
         | so for the sake of popularity would be despicable. My goal is
         | to achieve indistinguishability, which I certainly know is
         | possible. Anything short of near-perfection is unacceptable. "
        
           | scrollaway wrote:
           | Megalomania, always a great excuse.
           | 
           | AI and ML users are massively benefiting from open source but
           | too often refuse to release their data. It's like we're back
           | in the middle ages and alchemy is back in style.
        
             | hooloovoo_zoo wrote:
             | Judging by how the model and site are put together, I think
             | this is some software engineer's hobby project. Not wanting
             | to spill their secrets doesn't make them a megalomaniac for
             | the same reason being a magician doesn't make one a
             | megalomaniac.
        
               | scrollaway wrote:
               | Except magicians do actually share their secrets; there
               | is an active trade around it, conferences, discussions
               | and lots of reading material available. The barrier of
               | entry is higher than any old open source project but it's
               | not inaccessible and comparable to alchemy.
               | 
               | I was talking about ML in general, not just this project.
               | See OpenAI and their latest release for example: no
               | public product, no trained model. Just alchemy.
        
           | 15ai wrote:
           | I'm afraid this tweet is taken out of context. I had written
           | this in response to complaints about the release date being
           | delayed because I wanted to make sure that the released model
           | (that is currently on the site) was the best it could be.
           | 
           | I do plan to compile and publish my findings in the future,
           | but nothing is set in stone yet. I know that the model can be
           | improved even further, and I'd prefer to be as comprehensive
           | as possible.
        
           | whatshisface wrote:
           | Releasing a poorly done intermediate result would give either
           | competitors or colleagues a leg up in the race, depending on
           | whether one sees them as competitors or colleagues.
        
       | suyash wrote:
       | fun but what are the legal implications of using these voices for
       | projects? Does the license cover the use of these voices?
        
       | Roritharr wrote:
       | I'm pretty happy with the results I get. I've toyed around with a
       | similar goal, but with the idea of approaching voice actors to
       | give them a powerful tool to sell a "low quality" version of
       | their voice in bulk. That way an up and coming author could use a
       | tool like this and some elbow grease to create an Audiobook with
       | famous voices.
        
       | hmate9 wrote:
       | It's incredible how little data is required for amazing output!
       | Only a couple of minutes of talking needed.
       | 
       | You can find a couple of minutes of talking of anyone, so the
       | security implications are huge!
        
       | superasn wrote:
       | Really impressive. Do you plan to implement an API like Amazon,
       | Google that lets you generate TTS for a price?
        
         | wongarsu wrote:
         | I too think that this has potential as a cloud TTS service.
         | However that does open up all the moral and legal cans of worms
         | around this. I could imagine some of the voice actresses not
         | being very happy about somebody else commercializing their
         | voice without their consent.
         | 
         | The obvious way to get around this is to keep this as the
         | showcase and to pay some people to add their voices to the paid
         | version. I imagine this would sell just based on being decent
         | TTS with a wide range of voices, even when people don't know
         | the voices offered.
        
       | dnsiseuzb wrote:
       | How does this compare to wellsaid labs?
        
       | st1x7 wrote:
       | You should really see what happens when you click reject on their
       | terms and conditions prompt.
        
       | bailey1541 wrote:
       | If it doesn't work on mobile why bother sharing?
        
         | clxxx wrote:
         | It works on mobile for me. Tried it on both safari and chrome
         | on an iPhone running iOS 14.
        
       | rkagerer wrote:
       | One of the voice actors is John de Lancie!
       | 
       | https://soundcloud.com/user-860705643/q-pandemic-rant-no-mus...
        
       | junon wrote:
       | I'm rarely impressed by demos like this. This is a clear
       | exception.
       | 
       | Not only that, but the creator seems cool and down to earth.
       | Thanks for sharing, this is incredible work.
        
       ___________________________________________________________________
       (page generated 2021-01-06 23:04 UTC)