[HN Gopher] 15.ai
___________________________________________________________________
15.ai
Author : memorable
Score : 221 points
Date : 2022-06-12 03:33 UTC (19 hours ago)
(HTM) web link (15.ai)
(TXT) w3m dump (15.ai)
| hojjat12000 wrote:
| They named their tts "Deep Throat"? Why would you?
| layer8 wrote:
| Maybe they're seeing a need for text-to-speech in the porn
| market?
| Bytewave81 wrote:
| They knew.
| latenightcoding wrote:
| bronies
| xdennis wrote:
| It could be a reference to
| https://en.wikipedia.org/wiki/Deep_Throat_(Watergate)
| mgdlbp wrote:
| to the Deep _Foo_ pattern in deep learning naming, more
| likely.
| BeFlatXIII wrote:
| Why not both at once?
| droidist2 wrote:
| Which itself was a reference to the pornographic film of the
| same name.
|
| https://en.wikipedia.org/wiki/Deep_Throat_(film)
| userbinator wrote:
| Relatedly, a speech synth (or rather, the "output" part) that
| has appeared on HN before is named the Pink Trombone:
|
| https://news.ycombinator.com/item?id=18912628
| quenix wrote:
| Perhaps as a joke?
| lagrange77 wrote:
| I only get white noise after trying several inputs. Alignment
| Confidence > 80%
| [deleted]
| armchairhacker wrote:
| This is really cool. It's a text-to-speech tool, and the gist seems
| to be that they synthesize voices from only a little audio.
|
| The results are clearly synthetic and need work. However what's
| cool is that there are a ton of characters (from popular shows
| and video games) and there are useful statistics like inferred
| emotion (which is also in the output).
|
| Honestly it's a big problem how a lot of AIs are like "black
| boxes" where you really can't customize or see anything. Yeah we
| have DALL-E and GPT, which can generate text and images, but the lack
| of customization or fine-tuning the image afterwards severely
| hinders what's possible with them. Ultimately what you want is
| something interactive, where you can control how much or little
| the AI generates, and give it really specific criteria.
|
| But seriously: how did you get the domain `15.ai`?
| [deleted]
| jamal-kumar wrote:
| I just used it to make spongebob squarepants say bad things.
| BuyMyBitcoins wrote:
| This thing synthesizes dolphin squeaks? Wow!
| Der_Einzige wrote:
| In the case of text generation, we call this "Constrained Text
| Generation" and it is an active field of research. Without
| going into too many details (I have a paper out for review
| about this), it's pretty trivial to get "interactive control
| over how much or how little the AI generates" by a combination
| of filters on the LM's vocabulary, and effective selection of
| the various hyperparameters in the decoder (top_p, top_k,
| temperature)...
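
To make the comment above concrete, here is a minimal sketch of one decoding step that combines a vocabulary filter with temperature, top_k, and top_p. It is an illustration only, not the commenter's method or any library's API: the function names `filter_logits` and `sample_token`, the `banned` parameter, and the toy logits dictionary are all assumptions made up for this example.

```python
import math
import random


def filter_logits(logits, banned=(), temperature=1.0, top_k=0, top_p=1.0):
    """Turn raw {token: logit} scores into a constrained {token: probability} dict.

    All names here are hypothetical; this is a toy, not a real library call.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Vocabulary filter: drop any token the caller has banned.
    kept = {tok: score for tok, score in logits.items() if tok not in banned}
    if not kept:
        raise ValueError("every token was filtered out")

    # Temperature-scaled softmax (subtract the max for numerical stability).
    top = max(kept.values())
    weights = {tok: math.exp((score - top) / temperature) for tok, score in kept.items()}
    total = sum(weights.values())
    ranked = sorted(
        ((tok, w / total) for tok, w in weights.items()),
        key=lambda item: item[1],
        reverse=True,
    )

    # top_k: keep only the k most likely tokens (0 disables), then renormalize.
    if top_k > 0:
        ranked = ranked[:top_k]
        total = sum(p for _, p in ranked)
        ranked = [(tok, p / total) for tok, p in ranked]

    # top_p: keep the smallest prefix whose cumulative probability reaches top_p.
    nucleus, cumulative = [], 0.0
    for tok, p in ranked:
        nucleus.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break

    total = sum(p for _, p in nucleus)
    return {tok: p / total for tok, p in nucleus}


def sample_token(logits, rng=random, **constraints):
    """Sample one token from the constrained distribution."""
    probs = filter_logits(logits, **constraints)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]
```

For example, `sample_token({"a": 2.0, "b": 1.0, "c": 0.0}, banned={"a"}, top_k=1)` can only ever return `"b"`. Tightening or loosening `banned`, `top_k`, `top_p`, and `temperature` is one way to get the "how much or how little" control described above.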
| userbinator wrote:
| I agree this is an amazing demonstration of what AI can do, but
| I think that the current method of "learn and repeat" that
| depends on having tons of computing resources available is
| still too inefficient in many ways. Personally I'm more
| interested in what parameterisable formant-based synths can do,
| since they are extremely efficient and can produce a
| theoretically infinite variation of voices, although the output
| quality is still not great. Example:
| https://news.ycombinator.com/item?id=31604299
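
As a rough illustration of what "parameterisable formant-based" means, here is a minimal sketch of a source-filter vowel synth: an impulse train at the voice pitch is passed through a cascade of two-pole resonators, one per formant. This is not the synth linked above. The function names and the (frequency, bandwidth) values in the usage note are assumptions chosen for the example, and the output is a plain list of samples rather than an audio file.

```python
import math


def resonator(signal, freq_hz, bandwidth_hz, sample_rate):
    """Two-pole digital resonator: y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c  # normalizes the gain at 0 Hz to 1
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y1, y2 = y, y1
    return out


def synth_vowel(formants, pitch_hz=120.0, duration_s=0.5, sample_rate=16000):
    """Return peak-normalized samples for a steady vowel.

    formants: list of (center frequency in Hz, bandwidth in Hz) pairs.
    """
    n = int(duration_s * sample_rate)
    period = max(1, round(sample_rate / pitch_hz))
    # Crude glottal source: one unit impulse per pitch period.
    signal = [1.0 if i % period == 0 else 0.0 for i in range(n)]
    # Vocal-tract filter: run the source through each formant resonator in turn.
    for freq_hz, bandwidth_hz in formants:
        signal = resonator(signal, freq_hz, bandwidth_hz, sample_rate)
    peak = max(abs(s) for s in signal) or 1.0
    return [s / peak for s in signal]
```

Calling `synth_vowel([(730, 90), (1090, 110), (2440, 170)])` uses illustrative formant values for an "ah"-like vowel. Changing `pitch_hz` or the formant list changes the voice without any training data, which is the efficiency argument made in the comment above.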
| teaearlgraycold wrote:
| You can fine tune GPT-3
| canjobear wrote:
| Only if you're OpenAI
| jameshart wrote:
| Fine tuning of GPT3 models is available via their public
| API. Costs credits, and you need to get their permission to
| use it in an actual application, but it's not locked in a
| lab.
| sillysaurusx wrote:
| So "Only if you're OpenAI" :)
|
| If the weights were public, the community would figure
| out a way to fine tune it.
| jameshart wrote:
| It's not a matter of 'figuring out'. The model supports
| fine tuning. It's a core feature of the openai API.
| Running 'fine tuned' versions of GPT-3 that are created
| by customers is literally their SaaS model. They have
| examples in the documentation. Here:
| https://help.openai.com/en/articles/5528730-fine-tuning-a-cl...
| Dangeranger wrote:
| GPT-3 can be quite adaptive given prompt engineering and
| the uploading of sample files.
|
| Have you used GPT-3 with any of the methods mentioned in
| the docs?
|
| I've seen that GPT-3 can produce quite starkly different
| results when prompted differently and when samples have
| been uploaded.
| [deleted]
| Deritio wrote:
| Dall-E 2 has customization.
|
| You can remove or add things etc.
|
| And for GPT you can also specify more details.
|
| Only a question of time until you can work with the ai on your
| art/thing.
|
| There are ai models which keep track of context and others
| which generate a plan of actions.
|
| AI is not a blackbox
| sterlind wrote:
| OpenAI itself is a black box. Until I can reproduce their
| models or download them myself, and have unfettered access to
| them, it's just gatekept magic behind an API. So much for
| democratizing machine learning.
| judge2020 wrote:
| > So much for democratizing machine learning.
|
| Unless this is a recent change, their mission isn't that:
|
| > OpenAI's mission is to ensure that artificial general
| intelligence (AGI)--by which we mean highly autonomous
| systems that outperform humans at most economically
| valuable work--benefits all of humanity.
|
| https://openai.com/about/
| marcofatica wrote:
| > But seriously: how did you get the domain `15.ai`?
|
| it's an MIT project so I'm sure that was a factor
| paulsutter wrote:
| .ai domains cost a couple hundred bucks a year so domains are
| very available / not widely used by domain squatters (It's the
| country domain for the island of Anguilla, pop. 15,000).
| vehemenz wrote:
| That's a lot of SpongeBob and My Little Pony characters. At this
| point, is it fair to say the attachment to kids' cartoons is a
| cultural (or pathological) phenomenon for under 30s?
| eljimmy wrote:
| This is unrelated but what's with the fascination with HN users
| and My Little Pony? I've noticed this on a lot of posts in the
| past few months.
| canjobear wrote:
| A lot of people in tech circles have a sexual fixation on the
| show and its characters.
| BeFlatXIII wrote:
| It's a good thing they're warehoused in cities and
| apartments, then.
| jeroenhd wrote:
| Aside from the casual brony references, this project originally
| featured a lot of My Little Pony voices because it needed
| meticulously annotated transcriptions of the input audio to be
| trained well.
|
| The extremely dedicated brony subculture voluntarily put in a
| lot of work to get a corpus for the AI to learn from.
|
| There's also another factor at play: this AI works best with
| high-pitched voices, which My Little Pony is just full of.
| Not only did MLP provide such a generous source of training
| data, its results were also much more impressive than the dry
| dictation many other corpora would've resulted in, adding to its
| fame.
|
| I personally haven't seen any significant rise in MLP
| references, though that could be because I don't know the show
| so I don't catch references to it. It's also very possible that
| you've caught the Baader-Meinhof phenomenon.
| crooked-v wrote:
| It's basically the same as unironic appreciation of various
| child-targeted-but-adult-friendly 'slice of life' anime, just
| more incongruous-seeming because of the 'pony' thing.
| smoldesu wrote:
| I mean, 15.ai started as a 4chan project for /mlp/ users to
| generate voice lines from official voice actors now that
| Friendship is Magic is over (google Pony Preservation Project).
| Honestly, the _more_ impressive part is that a bunch of
| nobodies on an imageboard leapfrogged the rest of the industry
| and made a now-famous voice transformer model.
|
| In the greater sense, though? Ponies have always been this
| weird relic of internet absurdity and bear-baiting. Some people
| rep it ironically, other people are dead-serious, but the
| community has significant overlap with the STEM field. As a
| result, a lot of pony-related stuff would end up propagating
| into the tech world, much like this very project.
| loves_mangoes wrote:
| A lot of people in or around tech are furries, are into things
| like Japanese animation, or are into My Little Pony. I don't
| consider myself one, but people often jokingly say that furries
| run the Internet.
|
| And it's not really specific to HN. For instance, you have
| well-known people in the community who do vaccine R&D, or
| cryptography, or contribute to the C/C++ standards at ISO, or
| several other STEM things, who are pretty outspoken about their
| interests.
|
| This is made more obvious on Twitter, where people tend to blur
| their personal and work identities a lot.
| Der_Einzige wrote:
| My ML professor at the university I went to was also weirdly
| obsessed with MLP.
|
| Weeaboo/furry data scientists are always ahead of the industry
| - I seem to recall an effective decensoring model that was
| called "DeepCreamPy" and had almost 10K github stars before it
| was nuked and rehosted.
|
| I'm convinced that learning Statistics is in a zero-sum game
| with social skills.
| btown wrote:
| https://en.wikipedia.org/wiki/My_Little_Pony:_Friendship_Is_...
| explains in detail - between 2010 and ~2015 there was a massive
| overlap between millennial geek culture and unironic fandom of
| the rebooted My Little Pony show, especially among millennial
| men. One dedicated fan hub averaged almost 400k page views per
| day over its first 3.5 years of existence. And throughout it
| all, programming projects abounded, such as the delightful
| FiM++ esoteric language (https://esolangs.org/wiki/FiM%2B%2B)
| styled after the show's framing device. For many in tech now,
| it was an inescapable part of internet culture of the early
| 2010s, and a fond memory for many.
| jonas21 wrote:
| One of my favorite examples from that era:
|
| https://pjreddie.com/static/Redmon%20Resume.pdf
|
| And in case you were wondering what this little pony did
| next...
|
| https://scholar.google.com/citations?user=TDk_NfkAAAAJ&hl=en
| Der_Einzige wrote:
| Wait, the guy who wrote darknet IS THE SAME GUY WHO DID
| THIS RESUME?
|
| AHHHHHHHHH
| [deleted]
| drblue wrote:
| Friendship is Magic was a legitimately good show. (Or at least
| Season 1 and 2 were).
| nope96 wrote:
| Oh god, 50 shades of SpongePants. The future is wild in ways I
| never imagined. Star Trek style holodecks in what, 15 years?
|
| So, creepy thought: should we be recording audio of our parents,
| so we can still "hear from them" once in a while after they die?
| People are going to want to reconstruct their lost loved ones
| with AI. This project seems to imply you only need an hour or so
| of audio.
| batch12 wrote:
| After my dad died, we found that he had recorded every phone
| call he had with us. I thought about doing this combined with
| text generation to create plausible prompts but never got the
| guts to go through with it. He wouldn't care if I had done it,
| but it wouldn't ease the guilt from years of sighs and rolling
| my eyes when he always called at the wrong times.
| WalterGR wrote:
| If anyone is curious, the previous submission of this was
| popular: https://news.ycombinator.com/item?id=25654118
| convery wrote:
| Interesting how it seems like there's little correlation between
| source sample-size and quality. e.g. the Portal Sentry turret at
| 1.5 min of input vs the 100+ minutes of the narrator from The Stanley
| Parable, which sounded like auto-tune had a stroke.
| jeroenhd wrote:
| The AI seems to work best on high-pitched, female voices. The
| model seems to have improved in this regard since I last tried
| this website, but it's still very significantly biased towards
| female voices it seems.
| crooked-v wrote:
| Much of it depends on refinement work on each specific model.
| Try the Daria voices, for example, which are easy to get
| results with that sound like they came straight out of the
| show.
| darkerside wrote:
| Unfortunately, I guess I've reached the stage of my life where
| there are only three choices I actually would recognize out of
| the entire selection
| _gabe_ wrote:
| > All code and models used for this website were written and
| trained as part of my research at the Massachusetts Institute of
| Technology (MIT). The code and models are privately owned and are
| not to be sold or distributed for unauthorized use.
|
| Does anybody else find the irony in this statement absolutely
| amazing lol.
| belter wrote:
| https://tlo.mit.edu/learn-about-intellectual-property/owners....
|
| "...MIT owns inventions made or created by MIT faculty,
| students, staff, and others participating in sponsored research
| projects or in MIT programs using significant MIT funds or
| facilities or those inventions developed pursuant to a written
| agreement with MIT..."
|
| I got Rickrolled as soon as I arrived at the page. :-)
| ntoskrnl wrote:
| The Chell voice from Portal is extremely accurate
| BeFlatXIII wrote:
| How does she compare to the Gordon Freeman model?
| blooalien wrote:
| 100% accurate to be precise. ;)
| deeplearner1 wrote:
| If you want more information about 15.ai, I highly suggest
| reading their Wikipedia article!
| https://en.wikipedia.org/wiki/15.ai
|
| The whole history behind the project is fascinating: 4chan had a
| huge role in its development, and the project's work was stolen
| by an NFT company that a famous voice actor endorsed not too long
| ago.
| julianeon wrote:
| Ah, I was wondering why they were so concerned about
| attribution.
|
| The truth is that, today, if I was going to use a tool to
| generate voices (say for YouTube), I wouldn't necessarily pick
| a small SaaS tool. I'd use Amazon Polly or some other GCP-style
| platform voice creation tool. There are already a few products
| in the space, and their costs are so low as to be almost
| negligible (example: Polly, 5 million characters free). For a
| commercial project, I could probably stay on a free tier for a
| whole year.
|
| With DALL-E, it seems like the only option, and it's such a
| superior option that a website could abuse it for commercial
| profits. But for voice synthesis, it's already dirt cheap and
| commercially available without limitations.
| quickthrower2 wrote:
| What is the tl;dr? Got a wall of terms of service I didn't want to
| agree to, and clicking reject was a Rickroll.
| forrestthewoods wrote:
| The copyright laws around this are fascinating. They're adamant
| it must be non-commercial, they must be credited, and it can't be
| mixed with any other generated content. Meanwhile their content
| is exclusively derived from popular commercial products. Oh and
| they also make money via Patreon donations.
|
| I dunno. Feels a little gross to me. Eventually there is going to
| be a big copyright case about a model trained with copyrighted
| material. I have no idea how that will be resolved. Or maybe
| there will simply be new laws passed to make it either explicitly
| ok or explicitly not ok.
| deeplearner1 wrote:
| "Make money"? The creator loses several thousands of dollars a
| month hosting the site, and it's done for free. The Patreon
| donations are all voluntary and only offer a pittance to the
| developer.
|
| I highly suggest reading into the project first. The Wiki
| article I linked before (https://en.wikipedia.org/wiki/15.ai)
| answers all of your questions about copyright infringement.
| jason2323 wrote:
| Hah! If you click on reject on the cookies window it rickrolls[1]
| you!
|
| [1]https://www.urbandictionary.com/define.php?term=Rick%20Roll
| capelio wrote:
| Except that wasn't a cookies acceptance window...
| quickthrower2 wrote:
| Those 2 comments sum up the web in 2022
| layer8 wrote:
| Well that is one shitty ToS dialog.
| [deleted]
| claviska wrote:
| I appreciate the intent, and I understand that many people will
| do the wrong thing so this was probably an attempt to get such
| folks to actually read and adhere to the TOS, but the obnoxious
| consent dialog with a mandatory countdown turned me off. It's
| probably not effective, either.
|
| On desktop, maybe I'd open dev tools and remove it. On mobile, I
| won't be bothered. I hate that this is what the web has become
| and I choose to simply miss out on websites that behave this way.
| sophiebits wrote:
| Weird, I read through the text because I care about how I'm
| allowed to use the things people are giving me - and by the
| time I got to the Accept button, it was enabled.
| darkerside wrote:
| I just want you to know that it was absolutely hilarious to
| hear (the first half of) this read in the voice of SpongeBob
| SquarePants.
| s-xyz wrote:
| The DeepThroat model? Sounds familiar...
___________________________________________________________________
(page generated 2022-06-12 23:00 UTC)