[HN Gopher] Weird A.I. Yankovic: a cursed deep dive into the wor...
       ___________________________________________________________________
        
       Weird A.I. Yankovic: a cursed deep dive into the world of voice
       cloning
        
       Author : waxpancake
       Score  : 198 points
       Date   : 2023-10-02 15:15 UTC (7 hours ago)
        
 (HTM) web link (waxy.org)
 (TXT) w3m dump (waxy.org)
        
       | distantsounds wrote:
       | The sampled voices sound neither like Michael Jackson nor Weird
       | Al. A good effort, but a professional impersonator could likely
       | do better on either front.
        
         | code_runner wrote:
         | I know what you mean. It's more noticeable (imo) on the Michael
         | one.... but it's definitely in there. I think the pitch
         | correction is to blame for a bit of the weirdness.
        
         | hinkley wrote:
         | The best Michael Jackson interpreter in a town of 50,000 could
         | do better than this. It's... this is bad.
        
         | nemo44x wrote:
         | It sounds like Weird Al trying to be Michael Jackson trying to
         | be Weird Al.
        
           | Reventlov wrote:
           | As a non native speaker, it does sound a bit like Michael
           | Jackson imo...
        
             | hinkley wrote:
             | Sometimes I'll watch a movie with voiceover work, where
             | some character has a very specific accent, and I'll be
             | watching along for twenty minutes and the VA will let slip
             | just a couple syllables of their real voice and my ears
             | will prick up and I'll think, hey I know this guy. Isn't
             | that... oh the guy from the thing. From <wrong movie>, no
             | wait I mean <other movie>? Yes, it is.
             | 
             | That's what this sounds like. Five syllables of Michael
             | Jackson while he's trying to be Action Hero or Big Villain,
             | or Funny Sidekick (a problem Eddie Murphy has never had,
             | all evidence from Coming to America notwithstanding).
        
       | mecredis wrote:
       | It's kind of wild that these tools just transfer a copy of these
       | models every time they're spun up (whether it's to a Google Colab
       | notebook or a local machine.)
       | 
       | This must mean Hugging Face's bandwidth bill is crazy, or am
       | I missing something (maybe they have a peering agreement? heavily
       | caching things?)
        
         | pdntspa wrote:
         | I really wish I could configure this crap to cache somewhere
         | other than my C: drive
         | 
         | Or better yet, how about asking me where I want to store my
         | models?
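
On the cache-location complaint above: as far as the huggingface_hub documentation describes it, the download cache can be pointed elsewhere with the HF_HUB_CACHE or HF_HOME environment variables, set before the library is imported. The helper below is a hypothetical stand-in (`resolve_hub_cache` is not a library function) that mirrors that documented lookup order, so the effect of each variable can be checked without installing anything. Treat it as a sketch of the documented behaviour, not the library's actual code.

```python
import os
from pathlib import Path


def resolve_hub_cache(env=None):
    """Return the directory the Hub cache would use for a given environment.

    Hypothetical helper: it only mirrors the documented precedence
    (HF_HUB_CACHE, then HF_HOME/hub, then the per-user default).
    """
    env = os.environ if env is None else env
    # An explicit cache path takes priority.
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    # Otherwise the cache is a "hub" folder under HF_HOME.
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser() / "hub"
    # Default: ~/.cache/huggingface/hub (XDG_CACHE_HOME replaces ~/.cache).
    base = env.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(base).expanduser() / "huggingface" / "hub"
```

In practice that would mean setting something like HF_HOME to a folder on another drive before starting Python; many loaders also accept a `cache_dir` argument, though that has to be passed on every call.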
        
           | callalex wrote:
           | I haven't used windows in a while but I thought it supported
           | some form of cross-volume symlink? Or at least mounting an
           | image stored on another volume to an arbitrary path.
        
             | RajT88 wrote:
             | Links in windows are a thing, but not well known. I must
             | have been using Windows for close to 20 years before I
             | realized they were in there.
             | 
             | https://learn.microsoft.com/en-
             | us/windows/win32/fileio/hard-...
             | 
             | https://learn.microsoft.com/en-us/windows-
             | server/administrat...
        
               | jimmaswell wrote:
               | mklink /d on windows has saved me many times.
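
The symlink trick mentioned here can be sketched in Python as well. This is a minimal illustration, assuming the cache is an ordinary directory and the new location does not exist yet; `relocate_with_symlink` and both paths are hypothetical names, and on Windows creating symlinks may require Developer Mode or an elevated prompt.

```python
import os
import shutil
from pathlib import Path


def relocate_with_symlink(cache_dir, new_home):
    """Move cache_dir to new_home and leave a directory symlink behind."""
    cache_dir, new_home = Path(cache_dir), Path(new_home)
    if cache_dir.is_symlink():
        return cache_dir  # already relocated; nothing to do
    if new_home.exists():
        raise FileExistsError(f"{new_home} already exists")
    new_home.parent.mkdir(parents=True, exist_ok=True)
    if cache_dir.exists():
        # shutil.move falls back to copy-and-delete across volumes.
        shutil.move(str(cache_dir), str(new_home))
    else:
        new_home.mkdir()
    cache_dir.parent.mkdir(parents=True, exist_ok=True)
    # Roughly what `mklink /d <cache_dir> <new_home>` does on Windows.
    os.symlink(new_home, cache_dir, target_is_directory=True)
    return cache_dir
```

Tools that write to the old path keep working, while the bytes land on the other volume.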
        
         | jonluca wrote:
         | You can do a lot of these fully locally with things like RVC
         | web ui or https://tryreplay.io/
        
           | tr33house wrote:
           | wish they had something for Linux
        
         | satertek wrote:
         | Their Python module caches the downloads, which is checked
         | before downloading them again...but you're probably not wrong
         | on the crazy bandwidth bill. Looks like they have crazy VC
         | money though, considering the current climate.
        
           | minimaxir wrote:
           | The Colab notebooks are a fresh and independent session with
           | no caching.
        
             | infinityio wrote:
             | Google might cache further up the chain, which could help
        
         | civilitty wrote:
         | Unmetered 10+ gigabit connections were on the order of
         | $1/mbit/mo wholesale over a decade ago when I priced out a
         | custom CDN so for the cost of 100 TB of data transfer out of
         | AWS you could get a 24/7 sustained 10gbit/s (>3 PB per month at
         | 100% utilization).
         | 
         | Bandwidth has always been crazy cheap.
        
           | hotnfresh wrote:
           | Not all connections are created equal. Even some big
           | providers clearly have iffy peering agreements upstream
           | that'll manifest as terrible performance if you have a
           | widely-geographically-distributed bandwidth-heavy load.
        
           | colechristensen wrote:
           | Indeed. If you're not using a cloud provider bandwidth is
           | extremely cheap.
           | 
           | In fact locally I can get a 10 gbps _home internet_ unmetered
           | connection for $300 /mo.
           | 
           | I'm not sure how they'd react if I transferred 1 PB/mo though
           | :)
        
             | tomrod wrote:
             | Is my math wrong here? 10 gbps -> 8s per 10 GB -> 800s per
             | 1TB -> 80,000s per 1PB -> 22.3 hrs at full speed for 1 PB?
        
               | NavinF wrote:
               | If you search "1pib/(10 gbps)" on google, you'll get 10.4
               | days.
               | 
               | An unmetered 10G port at a US data center is ~$1500/mo.
               | Not particularly expensive
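
A quick check of the numbers in this subthread. The step from 1 TB to 1 PB is a factor of 1000, so 800 s per TB becomes 800,000 s per PB (a bit over 9 days), not 80,000 s. The sketch below assumes a perfectly saturated link with no protocol overhead; the function names are only illustrative.

```python
PB = 10**15          # petabyte (decimal)
PIB = 2**50          # pebibyte (binary), what "1pib" means in the search above
TEN_GBPS = 10 * 10**9  # link speed in bits per second


def transfer_time_days(num_bytes, link_bits_per_second):
    """Days needed to move num_bytes over a fully utilised link."""
    seconds = num_bytes * 8 / link_bits_per_second
    return seconds / 86_400


def bytes_per_month(link_bits_per_second, days=30):
    """Bytes a fully utilised link can move in a month of `days` days."""
    return link_bits_per_second / 8 * 86_400 * days


print(round(transfer_time_days(PIB, TEN_GBPS), 1))   # 10.4 days for 1 PiB
print(round(transfer_time_days(PB, TEN_GBPS), 1))    # 9.3 days for 1 PB
print(round(bytes_per_month(TEN_GBPS) / PB, 2))      # 3.24 PB per 30-day month
```

The last figure lines up with the ">3 PB per month at 100% utilization" estimate upthread.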
        
           | morkalork wrote:
           | If you host copies of your data with a few big providers
           | could you do something smart like detect and redirect
           | requests from AWS to an S3 bucket and not pay for bandwidth
           | leaving the provider?
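
One way the redirect idea could look, as a rough sketch: match the client address against each provider's published IP ranges and send it to a mirror hosted with that provider. Everything here is a placeholder: the CIDR blocks are reserved documentation ranges rather than real cloud ranges, the URLs are made up, and whether such traffic is actually free depends on the provider's pricing and region rules.

```python
import ipaddress

# Placeholder CIDR blocks (reserved documentation ranges), NOT real cloud ranges.
# A real deployment would load each provider's published list instead.
PROVIDER_RANGES = {
    "aws": ["203.0.113.0/24"],
    "gcp": ["198.51.100.0/24"],
}

# Hypothetical mirrors of the same files, one per provider.
MIRRORS = {
    "aws": "https://example-bucket.s3.amazonaws.com",
    "gcp": "https://storage.googleapis.com/example-bucket",
}
DEFAULT_ORIGIN = "https://cdn.example.com"


def pick_origin(client_ip):
    """Return the mirror inside the client's provider, else the default origin."""
    addr = ipaddress.ip_address(client_ip)
    for provider, cidrs in PROVIDER_RANGES.items():
        if any(addr in ipaddress.ip_network(cidr) for cidr in cidrs):
            return MIRRORS[provider]
    return DEFAULT_ORIGIN


def redirect_url(client_ip, path):
    """Build the URL a download request would be redirected to."""
    return f"{pick_origin(client_ip)}/{path.lstrip('/')}"
```

A server would answer the download request with an HTTP redirect to `redirect_url(...)` rather than streaming the file itself.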
        
       | Calamitous wrote:
       | Key takeaway:
       | 
       | > No current artificial intelligence is powerful enough to hide
       | the weirdness of Weird Al.
        
         | [deleted]
        
       | hinkley wrote:
       | > Artifacts aside, it sounds like Michael Jackson doing a Weird
       | Al impression?! Every line has a distinctly "white and nerdy"
       | vibe: it loses any seriousness and edge, exaggerating words for
       | comic effect and enunciating lyrics really clearly so the
       | punchlines can be heard.
       | 
       | No, it sounds like someone doing an impression of Weird Al
       | doing an impression of Michael Jackson. Someone whose mom told
       | them they were special and they believed it.
       | 
       | These examples are standing on a ridge line, surveying the
       | uncanny valley and looking for the best way to cross.
        
         | [deleted]
        
         | blagie wrote:
         | ... they're good enough.
         | 
         | I have an accent. If not for that, I'd be a great presenter.
         | 
         | If I could translate my voice into a poor Neil deGrasse Tyson,
         | a poor Patrick Stewart, a poor Carl Sagan, a poor Morgan
         | Freeman, etc., my presentations would be... better.
        
       | satvikpendem wrote:
       | What's the best open source text to speech? Eleven Labs and
       | others are interesting but closed source. I want to use them
       | mainly for audiobooks as I have a lot of ePubs and I'm just using
       | the basic Google text to speech voices on my Android, via Moon+
       | Reader. It works fine but it's still more robotic than state of
       | the art.
        
         | modeless wrote:
         | I've tried a few, not an expert, but I think Coqui's new XTTS
         | models are decent performance and quality wise (just in terms
         | of how the speech sounds, can't speak to the voice cloning
         | fidelity as I don't care about that). Open source code but non-
         | commercial license for the model. They also have a bunch of
         | models with more permissive licenses that aren't as good.
         | 
         | I doubt they're better than Google's TTS though.
        
         | lhl wrote:
         | For neutral sounding very fast/efficient voices, I find Coqui
         | TTS VITS models to be very good. For slower, more expressive
         | voice or voice cloning I think the Coqui TTS XTTS is good (or
         | you can look at the mrq/tortoise-tts).
         | 
         | I'm still awaiting a StyleTTS2 implementation. The audio
         | samples sound top notch: https://styletts2.github.io/
        
         | NoMoreNicksLeft wrote:
         | We bought the $300/month plan for a few months earlier this
         | year... and you'd only get 40 hours of audio generation for
         | that. It wasn't really sufficient to our needs.
         | 
         | How many audio books is 40 hours?
         | 
         | Also, while its voice cloning was truly amazing, every once in
         | a while the voice would get a little nutty and sound like an
         | insect just flew down their throat, or maybe they had an LSD
         | flashback. Normal normal normal then it's some Bobcat
         | Goldthwaite skit. And if you dialed down that parameter (I
         | think it's called stability?) then it goes monotone really
         | quickly.
         | 
         | We're probably several years out from it being something people
         | use personally for audio books.
        
           | dylan604 wrote:
           | >How many audio books is 40 hours?
           | 
           | Are you reading War & Peace or Cat In The Hat?
        
             | Jeff_Brown wrote:
             | I always assume 200 to 250 pages per book when someone
             | talks about large quantities of books.
        
               | satvikpendem wrote:
               | That's fairly short. I read about 100 books a year and it
               | includes thousand page tomes like The Count of Monte
               | Cristo.
        
               | dylan604 wrote:
               | I always assumed that book to be rather short since it
               | just needs to be a number of sandwiches eaten.
               | 
               | 100 books/year. That's an impressive feat regardless of the
               | number of pages. Are these downloaded ebooks or physical
               | printed copies of books?
        
               | satvikpendem wrote:
               | It's mostly audiobooks, I have some ePubs that don't have
               | audiobooks anywhere, such as many Japanese light novel
               | fan (or official) translations into English for example.
               | I can get through them as I can understand audio faster
               | than I can read text, as I play back at 3 to 5x speed.
        
               | dylan604 wrote:
               | what's your retention/comprehension of the content at
               | those speeds? i find that those speeds allow me to
               | understand the concept as it's whizzing by, but the
               | retention of it is not good. everything i've ever been
               | taught and personal experience about long term retention
               | all say speed is not the most conducive.
        
               | satvikpendem wrote:
               | Retention is pretty good but that's because I've been
               | training myself for the past 5 to 10 years to get to that
               | speed. It's similar to how blind people's TTS are
               | incomprehensible to most hearing-able people.
        
             | NoMoreNicksLeft wrote:
             | I like to read with my eyes, not listen. I honestly have no
             | idea how long an audio book is, hours-wise.
             | 
             | I've seen a few for download, and they're always like
             | hundreds of meg, if not over a gig. And that's in mp3,
             | where it should be compressed heavily.
        
               | squeaky-clean wrote:
               | In my audible library, the shortest is the first
               | Hitchhiker's Guide to the Galaxy at 5h51m. The longest is
               | The Power Broker at 66h9m. Most of the books I have are
               | in the 15-25 hour range, but I also have a lot of fantasy
               | stuff that gets near 50 hours (Game of Thrones, Brandon
               | Sanderson...).
        
         | ticulatedspline wrote:
         | Bark seems pretty good
         | 
         | https://github.com/suno-ai/bark Demo at
         | https://huggingface.co/spaces/suno/bark
         | 
         | In the couple samples I tried it was substantially better at
         | picking up meaning compared to VALL-E-X
        
         | follower wrote:
         | > What's the best open source text to speech?
         | 
         | I haven't re-evaluated OSS TTS options for a few months but
         | from my own experience earlier in the year I've been pleased
         | with the results I've gotten from Piper:
         | 
         | * https://github.com/rhasspy/piper
         | 
         | I've primarily used it with the LibriTTS-based voices due to
         | their license but if it's for personal local use you can
         | probably use some of the other even higher quality voices.
         | 
         | The official samples are here: https://rhasspy.github.io/piper-
         | samples/
         | 
         | Here's a small number of pre-rendered samples I've used that
         | were generated from a WIP Piper port of my Dialogue Tool[0]
         | project: https://rancidbacon.gitlab.io/piper-tts-demos/
         | 
         | While it's not perfect & output quality varies for a number of
         | reasons, I've been using it because it's MIT licensed & there's
         | multiple diverse voice options with licenses that suit my
         | purposes.
         | 
         | (Piper and its predecessors Larynx & Mimic3 are _significantly_
         | ahead of where other FLOSS options had been up until their
         | existence in terms of quality.)
         | 
         | [0] https://rancidbacon.itch.io/dialogue-tool-for-larynx-text-
         | to...
         | 
         | ----
         | 
         | Edit to add links to some of my notes related to FLOSS TTS, in
         | case they're of interest:
         | 
         | *
         | https://gitlab.com/RancidBacon/notes_public/-/blob/main/note...
         | 
         | *
         | https://gitlab.com/RancidBacon/notes_public/-/blob/main/note...
         | 
         | *
         | https://gitlab.com/RancidBacon/notes_public/-/blob/main/note...
        
         | entrepy123 wrote:
         | POST-EDIT, CORRECTED ANSWER
         | 
         | I doubt it's currently actually "the best open source text to
         | speech", but the answer I came up with when throwing a couple
         | of hours at the problem some months ago was "ttsprech" [3].
         | 
         | Following the guide, it was pretty trivial to make the model
         | render my sample text in about 100 English "voices" (many of
         | which were similar to each other, and in varying quality).
         | Sampling those, I got about 10 that were pretty "good". And
         | maybe 6 that were the "best ones" (very natural, not annoying
         | to listen to, actually sounded like a person by and large), and
         | maybe 2 made the top (as in, a tossup for the most listenable,
         | all factors considered).
         | 
         | IIRC, the license was free for noncommercial use only. I'm not
         | sure exactly "how open source" they are, but it was simple to
         | install the dependencies and write the basic Python to try it
         | out; I had to write a for loop to try all the voices like I
         | wanted. I ended up using something else for the project for other
         | reasons, but this could still be a fairly good backup option
         | for some use cases, IMO.
         | 
         | PRE-EDIT, ERRONEOUS ANSWER
         | 
         | Same as above, but I had said "Silero" [0, 1, 2] originally,
         | which I started trying out too, before switching to a third
         | (less open) option.
         | 
         | [0] https://github.com/snakers4/silero-models#text-to-speech
         | [1] https://silero.ai
         | [2] https://github.com/snakers4/silero-models#standalone-use
         | [3] https://github.com/Grumbel/ttsprech#usage
        
         | artninja1988 wrote:
         | Would also like to know this. Can't seem to find an open source
         | tts engine that works on mobile to read muh books
        
         | [deleted]
        
       | smath wrote:
       | Related article from 1 year ago on Darth Vader's voice being AI
       | generated going forward:
       | 
       | https://arstechnica.com/information-technology/2022/09/james...
        
       | mckirk wrote:
       | My absolute favorite application of this tech so far is The Beach
       | Boys singing 'Hurt'. It's the first time I seriously didn't
       | notice any artifacts, and it just works so well even though it
       | really shouldn't.
       | 
       | Enjoy: https://youtu.be/gmNSFqyg_Z8
        
         | code_runner wrote:
         | This account is one of the absolute top tier creators for weird
         | music mixes. The recent deep faking stuff has been shockingly
         | good. I think this is a good example of an "acceptable" use of
         | AI, as long as artists/composers etc rights are all settled.
         | 
         | it's always more fun when it's a real group of talented people
         | being silly, but I'd listen to an album of weird mashups like
         | this for sure.
        
         | dwringer wrote:
         | I don't know what I was expecting but that isn't Hurt, it's
         | Surfin' USA with Hurt's lyrics that sound extremely jittery and
         | grainy.
         | 
         | I'm curious though if some AI soon could in fact synthesize the
         | Beach Boys' style with the actual chords and melody from the
         | NIN song, possibly with some of the pathos of Johnny Cash as
         | well.
        
           | darkerside wrote:
           | Yeah, I hate it to the point of being personally offended. It
           | has nothing to do with Johnny Cash's rendition. I'd probably
           | feel a bit better, but not much, if it were advertised as a
           | NIN mashup.
        
             | aidenn0 wrote:
             | NIN and Cash have equal billing on that video. Many people
             | might only know Cash's rendition...
        
             | cm2012 wrote:
             | It's definitely in the realm of "soulless"
        
             | mock-possum wrote:
             | Yeah that's kind of the theme of the YouTube channel - I
             | think it's hilarious honestly, but maybe you have to go
             | into it knowing what to expect.
        
               | darkerside wrote:
               | Yeah, based on the parent, and the genius of the
               | musicians involved, I was expecting something more than
               | the sum of its parts. Hurt is an incredibly powerful
               | song, and the Cash rendition imbues it with another
               | beautiful layer.
               | 
               | As a joke, I can see it being funny, but it was a jarring
               | way to experience it.
        
           | legitster wrote:
           | I agree. The "x words over y music" can be fun, but isn't
           | really impressive as a true genre parody.
           | 
           | The one that always comes to mind for me is this video of an
           | Eminem interview done from scratch as a Talking Heads song:
           | https://www.youtube.com/watch?v=Kfl3N9nesRg
           | 
           | This is potentially something that generative AI _could_ be
           | good at doing (at least recreating vocals), but this parody
           | of the Talking Heads required a lot of _very clever_ insight
           | into what made a good Talking Heads song and returned a
           | convincing and novel melody. And I think we are still a ways
           | off.
        
             | adamesque wrote:
             | Yeah, Nick Lutsko is super super funny and a very talented
             | musician. That's hard to replicate.
        
           | sumtechguy wrote:
           | The one I found fun was the matrix ice ice baby mashup. That
           | was sort of janky but good enough to be fun.
        
         | hinkley wrote:
         | The graininess of the recording covers over a lot of potential
         | problems. But given that this attempt keeps the Beach Boys'
         | tempo and enunciation, I think this technique, whatever it is,
         | would make a much more compelling version of Michael Jackson
         | covering Eat It.
        
         | nsbk wrote:
         | That hurt
        
       | minimaxir wrote:
       | This article only covers the musical aspects of AI voice cloning,
       | but there's another dynamic to AI voice cloning that's more
       | complicated: replacing general voice actors in movies/video
       | games/anime (example: https://www.axios.com/2023/07/24/ai-voice-
       | actors-victoria-at... )
       | 
       | Unlike musicians who can't be replaced without significant
       | postprocessing, have enough money to not be impacted by
       | competition, and have legal muscle, voice over artists:
       | 
       | - Can be reproduced with good-enough results from out-of-the-box
       | voice cloning settings on ElevenLabs or an open source equivalent
       | (Bark, VALL-E X)
       | 
       | - Are already underpaid for their work as-is
       | 
       | - Have no legal ownership of their voice since they are
       | contractors, and their voicework is owned by their clients who
       | may not be as incentivised in protecting the VO.
       | 
       | I want to write a blog post about it but I suspect most people on
       | Hacker News won't be interested in a treatise on the cultural
       | impacts of the voicework in Persona 5 and Genshin Impact.
        
         | ImprobableTruth wrote:
         | Voices are uncopyrightable, but impersonation isn't legal (see
         | Midler v. Ford, for a notable case), so I don't think the
         | situation is totally clear.
        
           | deepsun wrote:
           | > voice actors are fearing that the ability for generative AI
           | to replicate their voices may cost them work
           | 
           | I'm not sure how to feel about that. I'm against the idea
           | that some people "deserve" being paid for being lucky born
           | with an interesting voice.
           | 
           | On the other hand, the world always worked like that. And,
           | say, hard-working farmer or doctor were also lucky being born
           | with necessary traits to make for their living, while others
           | weren't.
        
             | minimaxir wrote:
             | Voice acting is more than just talking into a microphone.
             | It's a skill not limited to the quality of voice.
        
               | deepsun wrote:
               | A lot of skills are not simple, but computers have taken
               | over them anyways. For example, financial bookkeeping is
               | not just writing and storing the books, it's a
               | professional skill with many tricks to learn. However,
               | databases and spreadsheets have taken the major part from
               | those jobs. Same could be said about programmers who
               | learned the skill of programming Assembly language. Or
               | performing -- vinyl records and CDs have largely taken
               | over orchestras and traveling musicians.
               | 
               | I would vote for it only if it somehow encouraged voice
               | actors to experiment and create new interesting styles.
               | Kinda like patents were designed to do -- encourage
               | inventors (although recently it became controversial in
               | IT world).
        
               | [deleted]
        
               | vunderba wrote:
               | You could have made that argument more effectively in the
               | past when voice actors had to be able to mimic multiple
               | voices (Dan Castellaneta, Mel Blanc, etc.). Nowadays,
               | we're seeing more and more shows where the voices of the
               | characters are just... the normal voice of the voice
               | actor.
               | 
               | Of course it's not totally devoid of skill, you need to
               | be able to emote, inflect, and convey emotion, but the
               | bar is _far far_ lower.
        
           | lazide wrote:
           | As long as they don't claim the voice is the original actor
           | (misspell the name perhaps, or the Hollywood classic 'based
           | on'), they won't be impersonating, no?
        
             | gs17 wrote:
             | The Ford ad didn't say it was Midler, they just implied it
             | by using her song with a soundalike. There was another
             | similar case with a parody ruled as impersonation. I don't
             | think there's good precedent for exactly where that line is
             | drawn.
        
           | sofixa wrote:
           | It's always funny to me when people cite old American case
           | law and try to wrangle their heads around how that can apply
           | to a situation which the case's participants couldn't have
           | possibly imagined. Shouldn't the correct way to do this be
           | new legislation being created after consulting interest
           | groups to answer the modern problems which exist due to
           | modern realities, like what the EU is doing? It seems much
           | more sensible of an approach instead of wondering how a 15th
           | century ruling's ruler would have applied his thinking about
           | something they couldn't even dream of.
        
             | lazide wrote:
             | Interest groups == lobbyists in this case. Which might
             | explain some of the American hesitation.
        
         | zerojames wrote:
         | I am interested! You should write about what you find
         | interesting; never worry if it will interest a particular
         | group.
        
         | EGreg wrote:
         | Please do. Some of us critique capitalism
        
         | foobarian wrote:
         | It saddens me because of how much impact they had on my family
         | as we played through the story line in Genshin and immersed in
         | the world. At some point we met a few of the voice actors at a
         | convention and they were like stars to us, while I'm sure their
         | circumstances are as you describe.
        
         | GuB-42 wrote:
         | Interesting note: many Vocaloids (most notably Hatsune Miku)
         | are sampled from voice actors rather than singers.
         | 
         | Singers didn't want software clones, but voices actors are fair
         | game.
        
         | sumtechguy wrote:
         | What I find interesting is this aspect that eventually, these
         | companies will hire some college kids who need a couple
         | thousand bucks and a free pizza. Have them read the right
         | scripts. Sign the right 'give everything away' contract and
         | just forever use their voice. Or do it sneaky. Have a voice
         | assistant and in your ToS 'we can use a copy of your voice for
         | anything'.
         | 
         | The existing voice actors will be just out of work. There will
         | be a small cadre of groups that want real voice. But for some
         | projects that will not be that important.
         | 
         | It's going to get crazy.
        
           | hiccuphippo wrote:
           | Mozilla has a voice data project where people already do it
           | for free(dom) ;)
           | 
           | https://commonvoice.mozilla.org/en
        
           | HappyDaoDude wrote:
           | I have said this will initially be sold as a feature on
           | things like Audiobooks.
           | 
           | Pick your book, pick your reader and away it goes. The Diary
           | of Anne Frank read by Gilbert Gottfried.
        
           | Legend2440 wrote:
           | They don't need that - they already have enough data to
           | generate plausibly human voices that don't sound like anyone
           | in particular.
           | 
           | Voice cloning is a special case, these models are equally
           | good at making new voices.
        
           | minimaxir wrote:
           | Recent voice models by OpenAI, Meta, and ElevenLabs all state
           | upfront they work with paid professional voice actors, so
           | this space will get interesting fast.
        
         | aaroninsf wrote:
         | <raises hand> I am
        
         | supriyo-biswas wrote:
         | HN isn't the only community to write for. While most people
         | here seem to be unsympathetic to such job concerns,
         | unconventional articles do hit the front page from time to
         | time.
         | 
         | I'd like to read it, in any case.
        
           | pixl97 wrote:
           | The get rich at any cost type like to post on these articles
           | at a higher rate I think. When you read a larger and broad
           | range of HN posts you see a substantial part of the
           | population here has concerns about this.
        
           | rcarr wrote:
           | +1, I would also like to read it
        
         | rcarr wrote:
         | It's sad if the only way voice actors are going to be able to
         | make a living is by doing stuff like Critical Role on Youtube.
         | I love Critical Role but it likely wouldn't be the same if
         | those guys hadn't spent years honing their craft. Watching
         | people play RPGs online has replaced a lot of my streaming
         | viewing now, but the market is much smaller and I imagine it
         | can only sustain a much smaller pool of creatives than the
         | current voice over market can.
        
         | dylan604 wrote:
         | > and their voicework is owned by their clients who may not be
         | as incentivised in protecting the VO.
         | 
         | The work product produced by their voice for fulfilling the
         | contract is owned. No corp owns someone else's voice.
        
           | minimaxir wrote:
           | They don't own the _voice_ , but they own the _vocal
           | performance_ , which ends up being a meaningless legal
           | distinction in practice.
           | 
           | It's one reason why VAs rarely take fan requests for a
           | character they voice.
        
             | dylan604 wrote:
             | If they are using their real voice, then they kind of
             | screwed themselves. If they are performing a character
             | voice, then at least they only lose out on that kind of
             | work.
             | 
             | I'm guessing contracts will need to be updated to say that
             | a character's voice made from AI can't be used so a
             | completely different production cannot say they have the
             | actor attached for publicity purposes.
        
           | rockemsockem wrote:
           | No one owns a voice at the moment. There is no mechanism in
           | the US to own a voice, even your own.
        
             | leni536 wrote:
             | A person's voice is effectively owned by the corresponding
             | person through right of publicity, which includes voice
             | depending on jurisdiction.
             | 
             | California, for example:
             | 
             | "Any person who knowingly uses another's name, _voice_,
             | signature, photograph, or likeness, in any manner, on or in
             | products, merchandise, or goods, or for purposes of
             | advertising or selling, or soliciting purchases of,
             | products, merchandise, goods or services, without such
             | person's prior consent, or, in the case of a minor, the
             | prior consent of his parent or legal guardian, shall be
             | liable for any damages sustained by the person or persons
             | injured as a result thereof."
             | 
             | https://leginfo.legislature.ca.gov/faces/codes_displaySecti
             | o....
        
           | Jeff_Brown wrote:
           | Property is a bundle of rights, and often hard to pin down.
           | In the case of voices, if a company owns enough of your data
           | to train a good simulacrum, and they have the right to do it,
           | then they kind of do own your voice -- or more precisely, a
           | damn good substitute.
        
             | minimaxir wrote:
             | Case in point, Luke Skywalker / Darth Vader in the D+
             | series: https://www.vanityfair.com/hollywood/2022/09/darth-
             | vaders-vo...
             | 
             | > Belyaev is a 29-year-old synthetic-speech artist at the
             | Ukrainian start-up Respeecher, which uses archival
             | recordings and a proprietary A.I. algorithm to create new
             | dialogue with the voices of performers from long ago. The
             | company worked with Lucasfilm to generate the voice of a
             | young Luke Skywalker for Disney+'s The Book of Boba Fett,
             | and the recent Obi-Wan Kenobi series tasked them with
             | making Darth Vader sound like James Earl Jones's dark side
             | villain from 45 years ago, now that Jones's voice has
             | altered with age and he has stepped back from the role.
        
             | bbarnett wrote:
             | Copyright is complex. And artists' rights are outside of
             | copyright, in some respects. An example: in the past,
             | painters have had their works bought, and then hung in
             | unfavourable conditions, or in places which reflect
             | poorly upon the work of art.
             | 
             | Artists have sued, and won, to have artwork moved, shown
             | differently, or force-sold back to the artist.
             | 
             | Now, everything you say is copyright... you. At least in my
             | legal jurisdiction! Even my image is, in Quebec! Yes, that
             | includes if you take my picture outside.
             | 
             | So what of one's voice? Say you don't have a real
             | agreement to use that voice in any way desired, and then
             | you use that voice to... I don't know, advocate for
             | terrorists or something weird.
             | 
             | What then?
             | 
             | I don't think it's completely clearcut, and I think there
             | will be changes, decisions on this going down the road.
        
               | dylan604 wrote:
               | I pay little attention to SAG contracts, but after the
               | Writers Guild strike, I'd be expecting SAG to follow
               | suit with major asks to protect its members from AI if
               | they have not already covered it.
        
               | minimaxir wrote:
               | The current standard is the NAVA AI Rider:
               | https://navavoices.org/2023/01/23/artificial-
               | intelligence-ri...
               | 
               | NAVA also has guidelines for protection against AI abuse:
               | https://navavoices.org/synth-ai/
        
               | dylan604 wrote:
               | Thanks. I have recently been asked by a couple of
               | acquaintances who have done a few character voices in
               | the past what I thought of AI and what can really be done
               | with it. Because of their infrequent performances, they
               | aren't union members, but I'll pass along these links.
        
               | autoexec wrote:
               | > Artists have sued, and won, to have artwork moved,
               | shown differently, or force-sold back to the artist.
               | 
               | That seems insane to me. Do you have specific examples?
        
               | rendx wrote:
               | https://en.wikipedia.org/wiki/Moral_rights
               | 
               | "Independent of the author's economic rights, and even
               | after the transfer of the said rights, the author shall
               | have the right to claim authorship of the work and to
               | object to any distortion, modification of, or other
               | derogatory action in relation to the said work, which
               | would be prejudicial to the author's honor or
               | reputation."
               | 
               | "The authors of dramatic works (plays, etc.) also have
               | the right to authorize the public performance of their
               | works (Article 11, Berne Convention).
               | 
               | https://en.wikipedia.org/wiki/Authors%27_rights
               | 
               | The protection of the moral rights of an author is based
               | on the view that a creative work is in some way an
               | expression of the author's personality: the moral rights
               | are therefore personal to the author and cannot be
               | transferred to another person except by testament when
               | the author dies."
               | 
               | I bet my voice is mine under most jurisdictions.
        
         | raytopia wrote:
         | I'd be interested.
         | 
         | Most likely you'd see a lot of people saying that somehow
         | getting rid of voice actors is good for "progress". Whatever
         | that means.
         | 
         | Random aside: someone really needs to make a Hacker News that
         | focuses more on game development and other arts, so blog posts
         | like the one you're talking about would have a proper community
         | to discuss them with.
        
           | Legend2440 wrote:
           | Replacing voice actors with text-to-speech is good because it
           | lets you do things voice actors can't:
           | 
           | * Create dynamic new voice lines at runtime, for example game
           | characters reacting to new situations.
           | 
           | * Operate at a scale that's infeasible for humans, for
           | example turning every ebook into an audiobook.
        
             | JohnFen wrote:
             | Which are, in my view, really minor advantages when
             | compared to the disadvantages. Not only in terms of putting
             | people out of work, but in terms of increasing the artifice
             | of the world around us and decreasing its humanity.
        
               | Legend2440 wrote:
               | "putting people out of work" by automating jobs is also a
               | good thing.
               | 
               | The amount of stuff humans can accomplish is strongly
               | limited by the supply of workers. Automating one job
               | frees them up to do other things.
        
       | mito88 wrote:
       | "celebrity voices impersonated"
       | 
       | Watch Light My Fire on YouTube Music
       | https://music.youtube.com/watch?v=lN3v3EfA6_A&si=_hcG3Wjakxd...
        
       | causi wrote:
       | AI song covers are incredible, from Goku singing "Don't Stop Me
       | Now" to the cast of Spongebob singing "Ocean Man".
        
         | lostlogin wrote:
         | https://m.youtube.com/watch?v=XzqbhDqAEtw
        
           | cm2012 wrote:
           | Would have strongly preferred DBZA goku :)
        
         | ssalka wrote:
         | My favorite is the Mr. Krabs cover of "Billie Jean"
         | 
         | https://www.youtube.com/watch?v=CkQ-44PvTs8
        
           | all2 wrote:
           | This is actually good. Hysterically so.
        
           | civilitty wrote:
           | Mr Krabs rapping Lose Yourself by Eminem [1] is all the
           | evidence I've ever needed that Clancy Brown should have been
           | a rapper.
           | 
           | [1] https://www.youtube.com/watch?v=d7N6jOziN4E
        
             | tetris11 wrote:
             | Those high notes really kind of overflow to baritone there
        
       | simonw wrote:
       | I did not know about this: "The center of the A.I. cover songs
       | community is a massive 500,000+ member Discord called A.I. Hub,
       | where members trade new tips, tools, techniques, and links to
       | their original and cover songs."
        
         | jrm4 wrote:
         | I poked around there for a while, and my takeaway was "sub-par"
         | all around, which might be the reason for its relative
         | obscurity? The thing is, I can't tell to what extent it's the
         | tech, and to what extent it's just "very uninteresting source
         | material."
         | 
         | Like, there's a whole lot of "classic song done by presently
         | popular rapper," and I'll be the first to insist that there is
         | nearly nothing vocally interesting at all coming from _today's_
         | popular hip-hop artists (and I say this as an extreme long-time
         | hip-hop aficionado)
        
           | [deleted]
        
         | joenot443 wrote:
         | Something I think we're slowly coming to terms with is that the
         | current generation of techies (the ones who can afford to spend
         | hours upon hours tweaking models and sharing results) really
         | prefer Discord over our Web 2.0 forum type communities like
         | this one. Even on reddit, which is lagging in popularity
         | amongst Gen-Z when compared to Discord or TikTok, you can
         | immediately tell upon reading /r/LocalLLMs that a really big
         | chunk of this community is underage. To be clear, I think
         | this is a good thing!
         | 
         | There was a generation that preferred mailing lists. There was
         | a generation that preferred IRC and BBS, and "my" generation
         | which likes forums and lengthy comment threads. One would be
         | naive to think this style (the one we're engaging in here)
         | would last forever.
         | 
         | There are definitely very real criticisms of Discord,
         | searchability and discoverability being the most common, but at
         | this point I think the die has been cast. Young people have
         | made their choice.
        
           | BandButcher wrote:
           | Agree, I'm in my early 30s and jump through most platforms,
           | but very little with tiktok/discord. But I have to admit a
           | lot of newer content (and tech framework support) has
           | migrated to discord channels. Even some YouTube sports talk
           | shows have their own discord for call ins, etc...
           | 
           | These big teleconference apps are usually hit or miss but
           | discord seems to be the winner currently for actual "social
           | networking", also add in its trend in the gaming community
        
         | codetrotter wrote:
         | Me neither. That's what's so weird about the internet.
         | 
         | Imagine half a million people out in the streets together.
         | You'd definitely notice that. Meanwhile, we can have these
         | massive online communities and you'd never know unless you
         | accidentally stumbled across it or someone told you about it.
        
           | evan_ wrote:
           | more accurate to say that, while 500,000 people joined the
           | discord by clicking a link, some much, much smaller number
           | are actually active on any sort of a regular basis
        
             | thomastjeffery wrote:
             | So to continue the analogy, 500,000 people
             | walked in that street at some point. Some unknown
             | percentage of that number is made of unrecognized
             | duplicates (same person new username).
        
             | dylan604 wrote:
             | this sounds like the description of most "new" social
             | platforms. we see immediate interest, and then a sudden
             | loss of that interest
        
             | LordDragonfang wrote:
             | Yeah, one of the "worst" (good for metrics, bad for
             | legibility) parts of the trend of moving to discord for any
             | sort of online community is that you have to "join" the
             | community to even _view_ any of the resources ensconced
             | within. Meaning it's poorly indexed (discord search is
             | okay, but not great) and not available at all to external
             | crawlers.
        
               | throwaway290 wrote:
               | If this community was available for crawling then LLM
               | would crawl it and there would be no value in
               | participating in the community because you can just ask
               | the LLM about all that, no?
        
               | LordDragonfang wrote:
               | If the value your community provides is low enough that
               | it can be effectively replaced by a general purpose LLM,
               | then it should be. The value of a _community_ should be
               | pushing the boundaries of knowledge, not gatekeeping it.
               | 
               | C'mon, this is _hacker_ news, what happened to
               | "information should be free"?
        
               | buildbot wrote:
               | The GP comment author is someone who refers to OpenAI as
               | "ClosedAI" (which to me speaks to a somewhat low level of
               | emotional maturity...), and seems to generally want
               | information to be as unfree as possible.
        
         | [deleted]
        
       ___________________________________________________________________
       (page generated 2023-10-02 23:00 UTC)