[HN Gopher] Show HN: A fully automated podcast - actually 12 pod...
___________________________________________________________________
Show HN: A fully automated podcast - actually 12 podcasts
"That Horoscope Podcast - Aquarius" and its eleven siblings are
daily podcasts that are programmatically generated end to end, i.e.
scripted, voiced, post-produced and uploaded. Would love to get
some first-impression feedback and hear how others would achieve
the same thing!
Author : holdenc137
Score : 23 points
Date : 2022-05-21 19:23 UTC (3 hours ago)
(HTM) web link (anchor.fm)
(TXT) w3m dump (anchor.fm)
| arrmn wrote:
| Can you elaborate more on the TTS? Did you prerecord fragments
| (how many did you actually do?) and just stitch them together?
| So there is a may.mp3, 22.mp3 and your script just puts them
| together?
| holdenc137 wrote:
| Sure.
|
| For dates etc - you got it. I think from memory it would be
| 'Wednesday' + 'the 18th' + 'of' + 'may...' + '20' + '22'
|
| For the narrative speech it would be more words in a file.
| There are plenty of files (EDIT: just checked 350ish files that
| cover all the variations of script that can be generated at the
| moment)
|
| In general the TTS part of the project is the 'art of the
| almost possible' (if TTS engines sounded really good, I'd have
| just used one off the shelf).
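|
| A minimal sketch of how a date could be split into prerecorded
| fragments like the ones above. This is not the project's code:
| the function names and the fragment-to-file naming are
| assumptions. Deriving the weekday from the date itself means
| nobody has to type it in by hand.

```python
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def ordinal(n: int) -> str:
    """Return n with its English ordinal suffix (1st, 2nd, 3rd, 4th, ...)."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def date_fragments(d: date) -> list[str]:
    """List the spoken fragments for a date, in the order they are stitched."""
    year = f"{d.year:04d}"
    return [
        WEEKDAYS[d.weekday()],   # date.weekday() returns 0 for Monday
        f"the {ordinal(d.day)}",
        "of",
        MONTHS[d.month - 1],
        year[:2],                # the year is spoken as two halves
        year[2:],
    ]


def fragment_filename(fragment: str) -> str:
    """Map a fragment to a hypothetical audio file name."""
    return fragment.lower().replace(" ", "_") + ".mp3"


if __name__ == "__main__":
    for fragment in date_fragments(date(2022, 5, 18)):
        print(fragment_filename(fragment))
```

| Splitting the year into two halves only reads naturally for years
| like 2022; a year such as 2005 would need its own fragment.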
| arrmn wrote:
| How did you come up with the initial list of phrases? Did you
| do some kind of analysis of other horoscopes?
| holdenc137 wrote:
| Listened to a few podcasts, read a few - then tried to come
| up with some combinations that (I hoped) were funny :)
|
| Here's all of the current 'starts' for the main prediction:
|
| (Note they're all pretty noncommittal - so anything could
| come next)
|
| A bite from a wild animal
|
| A financial matter coming to a head
|
| Completion of a long delayed task
|
| A seemingly generous gesture
|
| A sudden realisation
|
| An agreement with a headstrong peer
|
| An unavoidable slowdown
|
| Being pulled between two emotional options
|
| A sudden eruption of feelings
|
| Investigating a proverbial - light in the woods
|
| Involvement with a purely private project
|
| Making peace with the past
|
| The chance of a big win
|
| Today's socialising
| suprjami wrote:
| This makes me dread that soon many other podcasts will be
| automated like this, and it'll be orders of magnitude more
| difficult to find good content than it already is.
| holdenc137 wrote:
| Time to start lobbying for a 'generated'=true/false flag on RSS
| feeds?
|
| Also, I promise to only churn out inane content for LOLs.
| mro_name wrote:
| whatever there may be, it's drowned in ad- and spyware.
| papathunk wrote:
| hmmm - it's on Anchor (Spotify-owned) - I wonder where the
| spyware is coming from.
| planetsprite wrote:
| Very good idea and execution. I think you could do a lot more
| interesting stuff than making it about horoscopes.
|
| Also, as a Turing test, you should make one and never reveal it's
| entirely automated until it develops a big following. Due to the
| speed of automation you could mass produce podcasts of different
| types until one sticks, then put 10x more resources into the
| ones gaining traction, etc.
| holdenc137 wrote:
| I like your thinking :) I thought horoscopes (because they are
| kind of repetitive) were a good first fit - because the scripts
| could be generated from stock fragments: 'A chance encounter'
| etc...
|
| I think when the little glitches are ironed out of 'real' TTS -
| we'll be awash with generated content.
| planetsprite wrote:
| I can imagine. Imagine a podcast generation system. You'd
| simply have to describe the personalities of the host(s), the
| topic, the general vibe of the theme music, length, hardcode
| any sponsors, and a GPT-4 powered MLops service could produce
| something liked by a list of demographics 8 times out of 10
| in 5 minutes.
|
| There really is no stopping this train. In 2030 90% of
| internet content adopting the guise of being from the "real
| world" will be entirely generated by machine learning models.
| 90% of conversations you have with strangers online will
| likewise be with bots catered to influence you in subtle ways
| to maximize the return on revenue of your attention.
| holdenc137 wrote:
| In a parallel to email spam-bots, we get personal agents
| which (by your choice) filter out the generated content and
| make sure you only get the real deal.
|
| The fight is real.
| planetsprite wrote:
| This battle between bots and detectors will consume 99%
| of computer resources by then. Total GANnihilation.
| tobr wrote:
| I can't say I understand the point of this, so my only feedback
| is that the date in the episode from Saturday 21st of May is
| announced as "Thursday 21st of April".
| holdenc137 wrote:
| Yeah my bad - trust the human in the loop to put the date in wrong.
|
| As to the point, it's programming practice, perhaps a stepping
| stone to more elaborate content-generation systems, and jolly
| good fun too.
| planetsprite wrote:
| What vocal synthesis program did you use? Sounds 100% real at
| parts.
| holdenc137 wrote:
| Basically it is real. Because the possible scripts that can be
| generated are known - fragments of speech (e.g. 3, 4 or 5 word
| phrases) were recorded (so the intonation comes for free).
|
| Would be great to do it with an off-the-shelf TTS engine but I
| don't think they're quite there yet. I know my recording skills
| and microphone technique are rubbish - but if I knew what I was
| doing on that front - I think you'd be really hard pushed to
| tell it was stitched-together phrases.
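|
| A rough sketch of the stitching step using only Python's standard
| `wave` module. It assumes the fragments are uncompressed WAV files
| that share one audio format; the thread only mentions mp3 as a
| guess, so the file type and the `stitch_wavs` name are
| assumptions, not the project's actual approach.

```python
import wave


def stitch_wavs(fragment_paths, out_path):
    """Concatenate WAV fragments, in order, into a single WAV file."""
    fmt = None
    chunks = []
    for path in fragment_paths:
        with wave.open(str(path), "rb") as src:
            this_fmt = (src.getnchannels(), src.getsampwidth(),
                        src.getframerate())
            if fmt is None:
                fmt = this_fmt
            elif this_fmt != fmt:
                # Mixing formats would produce garbled audio, so refuse.
                raise ValueError(f"{path}: format {this_fmt} differs from {fmt}")
            chunks.append(src.readframes(src.getnframes()))
    if fmt is None:
        raise ValueError("no fragments given")

    with wave.open(str(out_path), "wb") as dst:
        dst.setnchannels(fmt[0])
        dst.setsampwidth(fmt[1])
        dst.setframerate(fmt[2])
        for chunk in chunks:
            dst.writeframes(chunk)
```

| Plain concatenation leaves hard cuts between fragments; short
| pauses or crossfades would be a separate step.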
| planetsprite wrote:
| The potential is 100x more with vocal synthesis imo. No need
| to make programmatic mad-libs style formats. Complete
| freedom, even though the quality isn't optimal.
| holdenc137 wrote:
| Totally agree. I think we're probably only a year or so off
| TTS that can put some proper intonation into a sentence -
| hopefully then they'll be indistinguishable from live
| speech.
|
| I've tried to listen to books with today's TTS and it soon
| becomes really grating (To my ears at least). It only needs
| the tiniest slip every few sentences and you can't listen
| any more.
| Li7h wrote:
| You have a text error in the description: "A daily horoscope
| podcast for Aquariums". Also the episode for May 22nd narrates the
| date as April 22nd. But I love this concept. Is this an AI speech
| engine or prerecorded snippets? Where did you get the text
| snippets from? Have you thought of incorporating GPT-3 into your
| horoscopes a la Co-Star?
| holdenc137 wrote:
| The date was my mistake - the 'Aquariums' was for LOLs. (see
| also 'Librarians' and similar)
|
| It's prerecorded snippets that came out of my mouth ;)
| edent wrote:
| Disturbingly accurate in my case. I've seen many arcane things
| today - including matches.
|
| (Which TTS are you using? Or have I misunderstood?)
| papathunk wrote:
| Haha.
|
| Sadly I don't think any (commercial / phoneme based) TTS would
| be very listenable for a podcast. Those are hand rolled
| fragments of speech. (Think old school Satnavs: "In " + "30
| yards " + "turn left".)
| number6 wrote:
| So how did you do it?
| [deleted]
| holdenc137 wrote:
| The generation of the 'script' uses something like an L-System:
| production rules. There's a big file along the lines of:
|
| [the podcast] = [intro] [main body] [outro]
|
| [main body] = [main prediction] [lucky colours] [alibi] etc
|
| // these rules finally break down to text, eg
|
| [main prediction start] = "A bite from a wild animal" or "A
| chance encounter"
|
| So the script has lots of combinations and is semi random - but
| it should always make sense.
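|
| The rules above could be expanded with a small recursive function
| like the sketch below. This is an illustration, not the project's
| code: the `RULES` layout, `expand`, `generate_script` and the
| "main prediction end" rule are assumptions, and every text other
| than the two prediction starts is an invented placeholder.

```python
import random

# Each rule maps a symbol to a list of alternatives; each alternative is a
# sequence of further symbols or literal text. Anything that is not a key
# in RULES is treated as literal text, i.e. one recorded fragment.
RULES = {
    "the podcast": [["intro", "main body", "outro"]],
    "main body": [["main prediction", "lucky colours", "alibi"]],
    "main prediction": [["main prediction start", "main prediction end"]],
    "main prediction start": [
        ["A bite from a wild animal"],
        ["A chance encounter"],
    ],
    # Placeholder texts below are invented for this example.
    "main prediction end": [["could change your plans today."]],
    "intro": [["Welcome to today's horoscope."]],
    "lucky colours": [
        ["Your lucky colour is green."],
        ["Your lucky colour is red."],
    ],
    "alibi": [["Results may vary."]],
    "outro": [["See you tomorrow."]],
}


def expand(symbol, rules, rng):
    """Recursively expand a symbol into a flat list of text fragments."""
    if symbol not in rules:
        return [symbol]  # literal text, nothing left to expand
    fragments = []
    # Pick one alternative at random, then expand each of its parts in order.
    for part in rng.choice(rules[symbol]):
        fragments.extend(expand(part, rules, rng))
    return fragments


def generate_script(rules, seed=None):
    """Generate one script; the same seed reproduces the same script."""
    rng = random.Random(seed)
    return expand("the podcast", rules, rng)


if __name__ == "__main__":
    print(" ".join(generate_script(RULES)))
```

| Because every alternative is written by hand, any random path
| through the rules still reads as a sensible script, and the list
| of literal texts doubles as the list of fragments to record.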
| [deleted]
| vanous wrote:
| Interesting! Is the code for the automation available?
| papathunk wrote:
| Will clean it up if there's enough interest. It's in a few
| parts ... 'transcript' generation, then the speech assembly...
| then dropping in the backing track + intro / outro, then
| uploading.
|
| Which bit's of interest?
___________________________________________________________________
(page generated 2022-05-21 23:01 UTC)