hngopher.com

       [HN Gopher] Show HN: An open-source, self-hostable synced narrat...
       ___________________________________________________________________
        
       Show HN: An open-source, self-hostable synced narration platform
       for ebooks
        
       Hi, I made a thing! This is by far the most work I've ever sunk
       into a side project; I've been working on this thing for over two
       years, and I'm super proud of it, even though there's still a lot
       more to do!
        
       Storyteller is a self-hosted platform for ebooks with synced
       narration. This is basically self-hosted WhisperSync, for anyone
       familiar with that Amazon product.
        
       It's currently made up of two self-hostable backend systems and a
       mobile app for reading and listening to the books it produces.
       Technically it uses an open spec, EPUB 3's "Media Overlay", for
       syncing the narration, but very few ebook apps actually support
       Media Overlays, and even fewer work well and have nice interfaces.
        
       The mobile app is available on the Apple App Store as "Storyteller
       Reader", and I plan to release it for Android as well early next
       year.
        
       Anyway, I hope someone finds this interesting or useful!
        
       Author : smoores
       Score  : 197 points
       Date   : 2023-12-23 20:11 UTC (1 day ago)
        
 (HTM) web link (smoores.gitlab.io)
 (TXT) w3m dump (smoores.gitlab.io)
        
       | bberenberg wrote:
       | Amazing, I've been wanting something like this for years. If only
       | Libby would integrate this so it could be used with rented books.
       | 
       | It would be great if you could add a link to the app on the App
       | Store.
        
         | ck_one wrote:
         | What's your use case for it?
        
           | bberenberg wrote:
           | Audiobooks while running / cooking / other activity where
           | reading doesn't make sense.
           | 
           | Ebook elsewhere.
        
       | cyberax wrote:
       | You didn't include the link:
       | https://smoores.gitlab.io/storyteller/
       | 
       | Looks super nice, the next step is to build a fully synced
       | ecosystem for book management.
        
       | atmosx wrote:
       | Good job! I'm probably going to use this. Would love to have my
       | collection accessible from mobile. One small nit: it would be
       | great to have non-Docker installation instructions readily
       | available.
        
       | 0x073 wrote:
       | More information would be nice: a link to the iOS app,
       | screenshots, or a list of the features the project has.
       | 
       | Is it an ebook/book library like audiobookshelf with sync, or
       | just sync? ( https://www.audiobookshelf.org/ )
        
         | joshstrange wrote:
         | Finding the app wasn't super easy; I do wish they'd link to it
         | from the mobile apps page.
         | 
         | https://apps.apple.com/us/app/storyteller-reader/id647446772...
        
       | mosselman wrote:
       | Is there a demo of the narration? I couldn't find any
        
         | danparsonson wrote:
         | It doesn't generate narration, it syncs existing audio books
         | with their written counterparts by transcribing the audio.
        
       | roywashere wrote:
       | How does the narration work? Is it automatically generated? For
       | a year now I've had a long commute and listened to audiobooks.
       | However, I find the narration varies wildly in quality, and I
       | think text-to-speech might often actually be better.
        
         | DecoPerson wrote:
         | > Once we have individual tracks to work with, we begin
         | transcription. This is the most resource intensive part of the
         | process. We rely on the Whisper AI transcription model from
         | OpenAI, via WhisperX. The WhisperX project also uses wav2vec2
         | to provide accurate word-level timestamps, which is important
         | for sentence-level synchronization. The transcription process
         | is fairly standard; the only interesting addition to the
         | process that Storyteller makes is to supply an "initial prompt"
         | to the transcription model, outlining its task as transcribing
         | an audiobook chapter and providing a list of words from the
         | book that don't exist in the English dictionary as hints.
         | 
         | https://smoores.gitlab.io/storyteller/docs/how-it-works/the-...
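
The "initial prompt" idea in the quoted docs can be sketched roughly as
below. This is a minimal illustration and not Storyteller's actual code:
the function names, the prompt wording, and the toy dictionary are all
assumptions. It collects words from the book that are missing from a
dictionary and folds them into a prompt string. A string like this could
then be handed to a transcription call (openai-whisper's `transcribe`
accepts an `initial_prompt` argument, for instance).

```python
import re


def unusual_words(book_text: str, dictionary: set[str]) -> list[str]:
    """Return words from the book that are absent from `dictionary`.

    Words are compared case-insensitively and returned in first-seen
    order without duplicates. `dictionary` is assumed to hold lowercase
    entries (hypothetical input; any word list would do).
    """
    seen: set[str] = set()
    result: list[str] = []
    for word in re.findall(r"[A-Za-z']+", book_text):
        cleaned = word.strip("'")
        key = cleaned.lower()
        if key and key not in dictionary and key not in seen:
            seen.add(key)
            result.append(cleaned)
    return result


def build_initial_prompt(book_text: str, dictionary: set[str],
                         max_words: int = 50) -> str:
    """Build a hint prompt for a transcription model (wording is made up)."""
    hints = unusual_words(book_text, dictionary)[:max_words]
    prompt = "The following is a transcription of an audiobook chapter."
    if hints:
        prompt += " Unusual words: " + ", ".join(hints) + "."
    return prompt


if __name__ == "__main__":
    toy_dictionary = {"the", "wizard", "met", "at", "dawn", "smiled"}
    chapter = "The wizard Gandalf met Frodo at dawn. Gandalf smiled."
    print(build_initial_prompt(chapter, toy_dictionary))
```

Capping the hint list (`max_words` here) is a guess at a sensible
precaution, since prompt length for such models is limited.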
        
         | tschumacher wrote:
         | You provide an audiobook and an ebook and it syncs them.
        
       | jupiter909 wrote:
       | Looks like an interesting project.
       | 
       | I'd highly suggest adding a quick intro demo video and/or
       | screenshots; they would really benefit a project like this.
        
       | grigio wrote:
       | Does it sync the reading progress of the ebook among clients?
        
       | klakierr wrote:
       | Does this work only for DRM-free ebooks and audiobooks?
        
         | NoahKAndrews wrote:
         | That's what the docs say, yes
        
       | sandreas wrote:
       | This is pretty interesting...
       | 
       | I once wrote a similar thing for building a custom LJSPEECH
       | dataset out of ebook/audiobook combinations to synthesize my
       | favorite narrator voices using coqui-tts and the VITS model and
       | make them "publish" books that never came out as audiobooks.
       | 
       | It was able to synchronize the book contents to timestamps, split
       | the spoken word into sentences, and create LJSPEECH datasets out
       | of the combinations. I used aeneas[1]; it was a bit finicky to
       | set up, but after a while it was even able to map non-English
       | languages (in my case German) with more than 80% accuracy. Worked
       | out pretty well, the LJSPEECH datasets were good (I still have
       | them here), but the TTS tech was not there yet :-) Maybe it's
       | time to revive this project using newer modelling approaches like
       | XTTS or something...
       | 
       | [1]: https://www.readbeyond.it/aeneas/
        
         | vagrantJin wrote:
         | I thought about exactly this a few years back but lacked the
         | technical skills to implement it. There are some great books
         | out there, as you mentioned, but even worse are great books with
         | mediocre narration/production. E.g., A Song of Ice and Fire on
         | Audible is absolutely horrid. The Martian by Andy Weir is
         | fantastic. Can I transplant Wil Wheaton or Greg Tremblay into
         | GOT? Can I have multiple characters narrated by different
         | voices?
         | 
         | Please revisit it if you can.
        
           | monkeywork wrote:
           | IMHO the original narration on The Martian by RC Bray is
           | better than Wil's. I enjoyed Wil's work on Ernest Cline's
           | books but RC Bray and Dennis E. Taylor are (for me) top of
           | the mountain when it comes to SF narration.
        
       | sphars wrote:
       | This is really neat; it's something I hadn't thought about
       | before. I've started listening to audiobooks on my commute, but I
       | read at night. I currently use audiobookshelf[0] to listen to my
       | audiobooks, and it has support for ebooks as well. I've added a
       | comment[1] on a discussion asking whether audiobookshelf could
       | read the EPUBs your tool creates.
       | 
       | [0]: https://www.audiobookshelf.org/
       | 
       | [1]:
       | https://github.com/advplyr/audiobookshelf/issues/189#issueco...
        
       | r4victor wrote:
       | Amazing! I made a similar ebook-audiobook aligner years ago:
       | https://github.com/r4victor/syncabook. At that time, I chose to
       | synthesize the text and align two audio sequences because I found
       | text-alignment approaches (including ML-based ones) too
       | compute-intensive and inadequate for long texts. I see Storyteller
       | works by aligning the texts. Could you give some idea of how long
       | it takes to sync a book?
       | 
       | Also, my experience was that audio and text versions are often
       | very different (e.g. the audio having an intro missing from the
       | text). It'd be very interesting to know how well Storyteller
       | handles such cases. Does it require manual audio/text editing or
       | handle the differences automatically?
        
         | NoahKAndrews wrote:
         | The docs say it's usually 1-4 hours depending on the book and
         | the hardware:
         | https://smoores.gitlab.io/storyteller/docs/syncing-books
         | 
         | The docs also have a detailed section about the algorithm that
         | goes into how it auto-handles differences between the audio and
         | the text.
        
           | cyberax wrote:
           | One obvious optimization is to sample the audio file at
           | regular intervals and transcribe only those samples, then
           | interpolate the locations in between. This can speed it up
           | by a couple of orders of magnitude.
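
The interpolation step of that suggestion might look something like the
sketch below. It is not how Storyteller works, and it trades sentence-level
accuracy for speed. It assumes a handful of anchor points have already been
found by transcribing short audio samples and locating them in the text.
The anchor format (character offset, seconds) and the function name are
hypothetical.

```python
from bisect import bisect_right


def interpolate_time(anchors: list[tuple[int, float]],
                     char_offset: int) -> float:
    """Estimate the audio time (seconds) for a position in the book text.

    `anchors` is a list of (char_offset, seconds) pairs sorted by
    char_offset, each assumed to come from one transcribed sample.
    Positions between two anchors are linearly interpolated; positions
    outside the anchored range are extrapolated from the nearest segment.
    """
    if len(anchors) < 2:
        raise ValueError("need at least two anchors to interpolate")

    offsets = [offset for offset, _ in anchors]
    # Index of the right-hand anchor of the segment containing char_offset,
    # clamped so that a valid (left, right) pair always exists.
    right = bisect_right(offsets, char_offset)
    right = min(max(right, 1), len(anchors) - 1)

    (o0, t0), (o1, t1) = anchors[right - 1], anchors[right]
    if o1 == o0:
        return t0
    fraction = (char_offset - o0) / (o1 - o0)
    return t0 + fraction * (t1 - t0)


if __name__ == "__main__":
    # Three made-up anchors: text offsets 0, 1000, 3000 were heard at
    # 0 s, 60 s, and 240 s respectively.
    sample_anchors = [(0, 0.0), (1000, 60.0), (3000, 240.0)]
    for offset in (500, 1000, 2000):
        print(offset, interpolate_time(sample_anchors, offset))
```

Linear interpolation assumes a roughly constant reading pace between
anchors, so denser sampling would be needed wherever the narrator's speed
changes or the audio and text diverge.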
        
       | joshstrange wrote:
       | This is super cool, I love my audiobook app (Prologue) but this
       | could tempt me away. Looking forward to setting this up and
       | trying it out!
        
       | zachlatta wrote:
       | This looks absolutely incredible, and like something I've been
       | trying to find for years! Thank you so much for building this!
        
       | majora2007 wrote:
       | Looks really nice. I wanted to do exactly this with my project
       | Kavita, but have been distracted with other things. I've heard
       | Whisper has great potential and a few of my users have been doing
       | something similar with it.
       | 
       | Looking forward to seeing how this project matures. We need more
       | options in the book reading scene that are self-hosted and not
       | Calibre.
        
       | rpxio wrote:
       | I absolutely love this. However, my wife and kids all read EPUBs
       | on Kobo e-readers, so I wish we could somehow sync the last page
       | read from Kobo to Storyteller so that we could pick up in the
       | audiobook later. I'm not opposed to installing KOReader on all of
       | our Kobos either if that would be required for syncing... it does
       | look like KOReader doesn't support EPUB 3 media overlays, but it
       | does have a sync feature.
        
       | mike986 wrote:
       | Super cool project!
       | 
       | > even though there's still a lot more to do
       | 
       | A few have asked on this thread already, but since you're already
       | using AI to transcribe, it would be super cool if we could use AI
       | to generate audio using TTS.
       | 
       | I quit Audible (after signing up a few times) because there are
       | very few high-quality audiobooks; even those read by the authors
       | are bad (most of them are not pro narrators).
       | 
       | A good AI would be amazing, as it never gets tired of speaking
       | for hours and maintains the same energetic voice, intonation, and
       | pace.
        
       | causality0 wrote:
       | Can this function as "Plex for audiobooks"? I don't really have a
       | need for synced books but it would be nice to keep fewer
       | audiobooks on my phone.
        
       | chrisweekly wrote:
       | Awesome! Thanks for sharing and working on this! WhisperSync
       | functionality is a game-changer; it's one of the main reasons I'm
       | able to read so much (switching modalities several times per
       | day). I'd love to see this featureset become ubiquitous instead
       | of being so tightly coupled to proprietary, DRM'd Amazon /
       | Audible.
        
       | snapplebobapple wrote:
       | man.. if someone could hook the creation service into
       | audiobookshelf this could be an extremely potent combination..
        
       | ZunarJ5 wrote:
       | Thank you for your hard work!!
        
       ___________________________________________________________________
       (page generated 2023-12-24 23:00 UTC)