[HN Gopher] Show HN: A fully automated podcast - actually 12 pod...
       ___________________________________________________________________
        
       Show HN: A fully automated podcast - actually 12 podcasts
        
       "That Horoscope Podcast - Aquarius" and its eleven siblings are
       daily podcasts that are programmatically generated end to end, i.e.
       scripted, voiced, post-produced and uploaded.  Would love to get
       some first-impression feedback and hear how others would achieve
       the same thing!
        
       Author : holdenc137
       Score  : 23 points
       Date   : 2022-05-21 19:23 UTC (3 hours ago)
        
 (HTM) web link (anchor.fm)
 (TXT) w3m dump (anchor.fm)
        
       | arrmn wrote:
       | Can you elaborate more on the TTS? Did you prerecord fragments
       | (how many did you actually do?) and just stitch them together?
       | So there is a may.mp3, 22.mp3 and your script just puts them
       | together?
        
         | holdenc137 wrote:
         | Sure.
         | 
         | For dates etc - you got it. I think from memory it would be
         | 'Wednesday' + 'the 18th' + 'of' + 'may...' + '20' + '22'
         | 
         | For the narrative speech it would be more words in a file.
         | There are plenty of files (EDIT: just checked 350ish files that
         | cover all the variations of script that can be generated at the
         | moment)
         | 
         | In general, the TTS part of the project is the 'art of the
         | almost possible' (if TTS engines sounded really good, I'd have
         | just used one off the shelf).
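The date assembly described in this comment ('Wednesday' + 'the 18th' + 'of' + ...) could look roughly like the sketch below. It is not the author's code: the clip file names (wednesday.mp3, the_18th.mp3 and so on) and the helper names ordinal and date_fragments are assumptions made for illustration, loosely following the may.mp3 / 22.mp3 naming guessed at in the parent comment.

```python
from datetime import date

# Hypothetical clip naming: one pre-recorded file per spoken fragment.
WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]
MONTHS = ["january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december"]


def ordinal(day: int) -> str:
    """Return the day with its English ordinal suffix, e.g. 18 -> '18th'."""
    if 11 <= day % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")
    return f"{day}{suffix}"


def date_fragments(d: date) -> list:
    """List the clip files, in speaking order, that announce the date d."""
    century, year = divmod(d.year, 100)  # 2022 -> (20, 22)
    return [
        f"{WEEKDAYS[d.weekday()]}.mp3",  # weekday() is 0 for Monday
        f"the_{ordinal(d.day)}.mp3",
        "of.mp3",
        f"{MONTHS[d.month - 1]}.mp3",
        f"{century}.mp3",
        f"{year:02d}.mp3",
    ]


print(date_fragments(date(2022, 5, 18)))
# ['wednesday.mp3', 'the_18th.mp3', 'of.mp3', 'may.mp3', '20.mp3', '22.mp3']
```

Deriving the weekday and month from a real date object, rather than typing them in by hand, would also avoid the wrong-date slips reported elsewhere in this thread.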
        
           | arrmn wrote:
           | How did you come up with the initial list of phrases? Did you
           | do some kind of analysis of other horoscopes?
        
             | holdenc137 wrote:
             | Listened to a few podcasts, read a few - then tried to come
             | up with some combinations that (I hoped) were funny :)
             | 
             | Here's all of the current 'starts' for the main prediction:
             | 
             | (Note they're all pretty noncommittal - so anything could
             | come next)
             | 
             | A bite from a wild animal
             | 
             | A financial matter coming to a head
             | 
             | Completion of a long delayed task
             | 
             | A seemingly generous gesture
             | 
             | A sudden realisation
             | 
             | An agreement with a headstrong peer
             | 
             | An unavoidable slowdown
             | 
             | Being pulled between two emotional options
             | 
             | A sudden eruption of feelings
             | 
             | Investigating a proverbial - light in the woods
             | 
             | Involvement with a purely private project
             | 
             | Making peace with the past
             | 
             | The chance of a big win
             | 
             | Today's socialising
        
       | suprjami wrote:
       | This makes me dread that soon many other podcasts will be
       | automated like this, and it'll be orders of magnitude more
       | difficult to find good content than it already is.
        
         | holdenc137 wrote:
         | Time to start lobbying for a 'generated'=true/false flag on RSS
         | feeds?
         | 
         | Also, I promise to only churn out inane content for LOLs.
        
       | mro_name wrote:
       | whatever there may be, it's drowned in ad- and spyware.
        
         | papathunk wrote:
         | hmmm - it's on Anchor (spotify owned) - I wonder where the
         | spyware is coming from.
        
       | planetsprite wrote:
       | Very good idea and execution. I think you could do a lot more
       | interesting stuff than making it about horoscopes.
       | 
       | Also, as a Turing test you should make one and never reveal it's
       | entirely automated until it develops a big following. Due to the
       | speed of automation you could mass produce podcasts of different
       | types until one sticks, then put 10x more resources into the ones
       | gaining traction, etc.
        
         | holdenc137 wrote:
         | I like your thinking :) I thought horoscopes (because they are
         | kind of repetitive) were a good first fit - because the scripts
         | could be generated from stock fragments 'A chance encounter'
         | etc...
         | 
         | I think when the little glitches are ironed out of 'real' TTS -
         | we'll be awash with generated content.
        
           | planetsprite wrote:
           | I can imagine. Imagine a podcast generation system. You'd
           | simply have to describe the personalities of the host(s), the
           | topic, the general vibe of the theme music, length, hardcode
           | any sponsors, and a GPT-4 powered MLops service could produce
           | something liked by a list of demographics 8 times out of 10
           | in 5 minutes.
           | 
           | There really is no stopping this train. In 2030 90% of
           | internet content adopting the guise of being from the "real
           | world" will be entirely generated by machine learning models.
           | 90% of conversations you have with strangers online will
           | likewise be with bots catered to influence you in subtle ways
           | to maximize the return on revenue of your attention.
        
             | holdenc137 wrote:
             | In a parallel to email spam-bots, we get personal agents
             | which (by your choice) filter out the generated content and
             | make sure you only get the real deal.
             | 
             | The fight is real.
        
               | planetsprite wrote:
               | This battle between bots and detectors will consume 99%
               | of computer resources by then. Total GANnihilation.
        
       | tobr wrote:
       | I can't say I understand the point of this, so my only feedback
       | is that the date in the episode from Saturday 21st of May is
       | announced as "Thursday 21st of April".
        
         | holdenc137 wrote:
         | Yeah my bad - trust the human in the loop to put the date in
         | wrong.
         | 
         | As to the point, it's programming practice, perhaps a stepping
         | stone to more elaborate content-generation systems, and jolly
         | good fun too.
        
       | planetsprite wrote:
       | What vocal synthesis program did you use? Sounds 100% real at
       | parts.
        
         | holdenc137 wrote:
         | Basically it is real. Because the possible scripts that can be
         | generated are known - fragments of speech (eg 3,4,5 word
         | phrases) were recorded (so the intonation is free).
         | 
         | Would be great to do it with an off-the-shelf TTS engine but I
         | don't think they're quite there yet. I know my recording skills
         | and microphone technique are rubbish - but if I knew what I was
         | doing on that front - I think you'd be really hard pushed to
         | tell it was stitched together phrases.
        
           | planetsprite wrote:
           | The potential is 100x more with vocal synthesis imo. No need
           | to make programmatic mad-libs style formats. Complete
           | freedom, even though the quality isn't optimal.
        
             | holdenc137 wrote:
             | Totally agree. I think we're probably only a year or so off
             | TTS that can put some proper intonation into a sentence -
             | hopefully then they'll be indistinguishable from live
             | speech.
             | 
             | I've tried to listen to books with today's TTS and it soon
             | becomes really grating (To my ears at least). It only needs
             | the tiniest slip every few sentences and you can't listen
             | any more.
        
       | Li7h wrote:
       | You have a text error in the description: "A daily horoscope
       | podcast for Aquariums". Also the episode for May 22nd narrates the
       | date as April 22nd. But I love this concept. Is this an AI speech
       | engine or pre-recorded snippets? Where did you get the text
       | snippets from? Have you thought of incorporating GPT-3 into your
       | horoscopes a la Co-Star?
        
         | holdenc137 wrote:
         | The date was my mistake - the 'Aquariums' was for LOLs. (see
         | also 'Librarians' and similar)
         | 
         | It's prerecorded snippets that came out of my mouth ;)
        
       | edent wrote:
       | Disturbingly accurate in my case. I've seen many arcane things
       | today - including matches.
       | 
       | (Which TTS are you using? Or have I misunderstood?)
        
         | papathunk wrote:
         | Haha.
         | 
         | Sadly I don't think any (commercial / phoneme based) TTS would
         | be very listenable for a podcast. Those are hand rolled
         | fragments of speech. (Think old school satnavs: "In " + "30
         | yards " + "turn left")
        
       | number6 wrote:
       | So how did you do it?
        
         | [deleted]
        
         | holdenc137 wrote:
         | The generation of the 'script' uses a kind of L-system-like
         | set of production rules. There's a big file along the lines of:
         | 
         | [the podcast] = [intro] [main body] [outro]
         | 
         | [main body] = [main prediction] [lucky colours] [alibi] etc
         | 
         | // these rules finally break down to text, eg
         | 
         | [main prediction start] = "A bite from a wild animal" or "A
         | chance encounter"
         | 
         | So the script has lots of combinations and is semi random - but
         | it should always make sense.
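One way to read the production rules in this comment is as a small context-free grammar expanded at random. The sketch below follows that reading under stated assumptions: the rule names come from the comment, but the dictionary layout, the expand function, the placeholder texts in parentheses, and the one-step [main prediction] rule are all invented here and are not the author's actual rule file (which, per the "etc", has more parts).

```python
import random

# Each symbol maps to a list of alternatives; each alternative is a sequence
# of symbols. Anything that is not a key in RULES is treated as final text.
RULES = {
    "the podcast": [["intro", "main body", "outro"]],
    "main body": [["main prediction", "lucky colours", "alibi"]],
    # Assumption: the real rule presumably continues after the start phrase.
    "main prediction": [["main prediction start"]],
    "main prediction start": [
        ["A bite from a wild animal"],
        ["A chance encounter"],
    ],
    # Placeholder texts, not taken from the podcast.
    "intro": [["(intro text)"]],
    "lucky colours": [["(lucky colours text)"]],
    "alibi": [["(alibi text)"]],
    "outro": [["(outro text)"]],
}


def expand(symbol, rules, rng):
    """Recursively expand a symbol into a flat list of text fragments."""
    if symbol not in rules:
        return [symbol]  # terminal: emit the text as is
    alternative = rng.choice(rules[symbol])  # pick one production at random
    fragments = []
    for part in alternative:
        fragments.extend(expand(part, rules, rng))
    return fragments


script = expand("the podcast", RULES, random.Random())
print(script)
```

Because every terminal is a known string, the full set of fragments that can ever be spoken is finite, which is what makes pre-recording them feasible.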
        
         | [deleted]
        
       | vanous wrote:
       | Interesting! Is the code for the automation available?
        
         | papathunk wrote:
         | Will clean it up if there's enough interest. It's in a few
         | parts ... 'transcript' generation, then the speech assembly...
         | then dropping in the backing track + intro / outro, then
         | uploading.
         | 
         | Which bit's of interest?
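For the 'speech assembly' step, a minimal sketch using only Python's standard wave module is shown below. It assumes the fragments are uncompressed WAV clips that all share the same channel count, sample width and sample rate; the thread mentions .mp3 names, and MP3 clips would need an external tool or library instead. The function name concat_wavs is hypothetical, and mixing in the backing track and uploading are not covered.

```python
import wave


def concat_wavs(paths, out_path):
    """Join WAV clips end to end into out_path.

    All clips must share channel count, sample width and sample rate.
    """
    if not paths:
        raise ValueError("no clips to concatenate")
    expected = None
    with wave.open(str(out_path), "wb") as out:
        for path in paths:
            with wave.open(str(path), "rb") as clip:
                fmt = (clip.getnchannels(), clip.getsampwidth(),
                       clip.getframerate())
                if expected is None:
                    # The first clip decides the output format.
                    expected = fmt
                    out.setnchannels(fmt[0])
                    out.setsampwidth(fmt[1])
                    out.setframerate(fmt[2])
                elif fmt != expected:
                    raise ValueError(f"{path} has a different audio format")
                out.writeframes(clip.readframes(clip.getnframes()))
```

A call such as concat_wavs(["wednesday.wav", "the_18th.wav", "of.wav"], "date.wav") would then produce one continuous clip; those file names are illustrative only.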
        
       ___________________________________________________________________
       (page generated 2022-05-21 23:01 UTC)