Post B2eWGbjV7V2aBIvZ20 by icedquinn@blob.cat
 (DIR) Post #B2eWGbjV7V2aBIvZ20 by icedquinn@blob.cat
       2026-01-25T03:36:30.633165Z
       
       0 likes, 0 repeats
       
       probably gonna take down the ambercast board. it was fun to set up but two weeks and nobody has listened to the book nor posted anything.
       y'all got like infinite trump drama and distant corposhit to keep yourselves entertained by in perpetuity, no real point in me trying to run anything else.
       
 (DIR) Post #B2eWGcyQVUZe1suzw0 by raintrees@noauthority.social
       2026-01-25T04:57:48Z
       
       0 likes, 0 repeats
       
       @icedquinn what is your ambercast board?  If you care to share with a stranger...
       
 (DIR) Post #B2eWGeGtgIwW3SZGMa by icedquinn@blob.cat
       2026-01-25T04:59:45.646841Z
       
       0 likes, 0 repeats
       
       @raintrees i was using chatterbox and some hand editing to turn the chronicles of amber into a tts podcast
       
 (DIR) Post #B2eWGfACMhB8oyRQYq by raintrees@noauthority.social
       2026-01-25T05:06:26Z
       
       0 likes, 0 repeats
       
       @icedquinn cool, I like Zelazny :)
       How does chatterbox perform?  Last time I messed around with text to speech (it has been more than a year, maybe more than that) I was less than impressed.
       And lately I am beginning to find AI reads more and more annoying, all the wrong emphasis in sentences...
       You have this up somewhere?
       
 (DIR) Post #B2eWGfsVhwcl2PAod6 by icedquinn@blob.cat
       2026-01-25T05:08:23.940680Z
       
       0 likes, 0 repeats
       
       @raintrees it's slow (maybe 2x realtime on cpu) but the quality is very acceptable.
       https://pillowfort.iceworks.cc/amber/
       it still mutters randomly at times or has funny emphasis, but i had someone tell me they didn't even clock it as AI so :youmusip:
       
 (DIR) Post #B2eWGgtbvL6AC6hCz2 by raintrees@noauthority.social
       2026-01-25T05:14:10Z
       
       0 likes, 0 repeats
       
       @icedquinn I wonder if it would be worth it to come up with a markdown dialect to have tts feel closer to normal human speech patterns, in regards to emphasis?
       how were the different voices triggered for the different characters?  Was that part of your hand-editing?
       
 (DIR) Post #B2eWGhczCdOWSpvRi4 by icedquinn@blob.cat
       2026-01-25T05:18:05.450639Z
       
       0 likes, 0 repeats
       
       @raintrees chatterbox's turbo model supports a very small number of triggers like [cough] or [laugh]. otherwise there WAS a standard called SSML or something, that was once supported by commercial TTS, that was exactly some xml system for being able to transmit desired phonemes and parameter changes. Neural models haven't been trained to respect them because as i say, why do anything right :comfystoner: it's all about chasing that benchmark number and who cares about usability, interpretability, customization, etc, just get that WER score gooder. :ablobcatgooglytenor:
       anyway i have to copy paste the manuscript into a text file and then reformat it to fountain (a screenplay markup) and a python script uses a fountain parser to pair characters with parameters for chatterbox, and writes those as json files, which i then use ninja build to "compile" to audio via curling the json file to the server.
       one of god's goofiest uses of ninja build but it absolutely works and it got me out of having to actually learn dagu or utask :cirno_baka:
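A rough sketch of what that script-to-JSON-to-ninja pipeline might look like. This is not the poster's script: the parser below is a hand-rolled simplification of Fountain's character-cue rule rather than a real Fountain library, and the voice table, JSON field names (`voice_sample`, `text`), server URL, and file layout are all hypothetical placeholders.

```python
import json
import re
from pathlib import Path

# Hypothetical voice table: character cue -> request parameters.
# The field names are placeholders, not a real server schema.
VOICES = {
    "NARRATOR": {"voice_sample": "samples/narrator.wav"},
    "CORWIN": {"voice_sample": "samples/corwin.wav"},
}


def parse_script(script: str) -> list[dict]:
    """Split a script into {"character", "text"} segments.

    Simplified Fountain-like rule: a block whose first line is all
    uppercase and is followed by more lines is dialogue for that
    character. Every other block is treated as narration.
    """
    segments = []
    for block in re.split(r"\n\s*\n", script.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if not lines:
            continue
        if len(lines) > 1 and lines[0].isupper():
            character, body = lines[0], lines[1:]
        else:
            character, body = "NARRATOR", lines
        segments.append({"character": character, "text": " ".join(body)})
    return segments


def write_requests(segments, out_dir, voices=VOICES):
    """Write one JSON request file per segment and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, seg in enumerate(segments, start=1):
        # Unknown characters fall back to the narrator's parameters.
        params = voices.get(seg["character"], voices["NARRATOR"])
        path = out / f"{i:04d}.json"
        path.write_text(json.dumps({**params, "text": seg["text"]}, indent=2))
        paths.append(path)
    return paths


def ninja_file(json_paths, url="http://localhost:8000/tts"):
    """Return build.ninja text that curls each JSON file to the server.

    The URL is an assumed endpoint; paths containing spaces or colons
    would need ninja escaping, which this sketch does not handle.
    """
    out = [
        "rule tts",
        "  command = curl -sS --fail -H 'Content-Type: application/json' "
        f"--data-binary @$in -o $out {url}",
        "",
    ]
    for p in json_paths:
        wav = Path(p).with_suffix(".wav").as_posix()
        out.append(f"build {wav}: tts {Path(p).as_posix()}")
    return "\n".join(out) + "\n"
```

Writing the returned text to `build.ninja` and running `ninja` would then only re-synthesize segments whose JSON changed, which is presumably the appeal of using a build tool here.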
       
 (DIR) Post #B2eWGiY3mR33Jqd1fc by raintrees@noauthority.social
       2026-01-25T05:19:20Z
       
       0 likes, 0 repeats
       
       @icedquinn my hat is off to you, that is a bit of rigging :)
       
 (DIR) Post #B2eWGju4k4FjWPw7cm by icedquinn@blob.cat
       2026-01-25T05:22:41.662232Z
       
       0 likes, 0 repeats
       
       @raintrees surprising amount of work actually goes into it. esp. sourcing the voice samples, cutting them down to the 30sec limit, running them through resemble enhance (removes noise, repairs compression damage), then through matchering (masters the samples to a reference)
       at one point the hale appleman clone was from like him in a bathrobe and it made the narration sound super anxious like being forced to read on a toilet, but the same sample is what it is now after discovering i could indeed run the cleaning and mastering models on cpu.
       i wouldn't use this crap for just any book but someone was really hyping amber at the time.
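For the "cut down to the 30sec limit" step, a minimal sketch using only Python's standard `wave` module. It assumes the sample is already an uncompressed PCM WAV file; the thread does not say which tool was actually used for trimming, and the resemble enhance and matchering steps are not shown here.

```python
import wave


def trim_wav(src: str, dst: str, max_seconds: float = 30.0) -> float:
    """Copy at most max_seconds of audio from src to dst.

    Returns the duration of the written file in seconds.
    """
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        max_frames = int(max_seconds * params.framerate)
        frames = reader.readframes(min(params.nframes, max_frames))

    with wave.open(dst, "wb") as writer:
        # Reuse the source format; the frame count in the header is
        # corrected automatically when the file is closed.
        writer.setparams(params)
        writer.writeframes(frames)

    with wave.open(dst, "rb") as check:
        return check.getnframes() / check.getframerate()
```

This keeps the first 30 seconds only; picking the cleanest 30 seconds of a longer recording would still be a manual choice.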
       
 (DIR) Post #B2eWGkck3zyvkwpnFI by raintrees@noauthority.social
       2026-01-25T05:29:56Z
       
       0 likes, 0 repeats
       
       @icedquinn Again, kudos - I have a hard enough time just wrangling ffmpeg to slice up audio books into chapters :)
       
 (DIR) Post #B2egEZiWdtlitN8mpM by icedquinn@blob.cat
       2026-01-25T05:31:12.481048Z
       
       0 likes, 0 repeats
       
       @raintrees i have a fish alias for that from when i used to watch downloaded youtube more than i do
           command ffmpeg -i $argv[1] -c:v copy -c:a copy -f segment -segment_time 30:00 -reset_timestamps 1 $argv[2]
       although it won't intelligently seek chapter headings or even silent parts.
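The same ffmpeg invocation wrapped in Python, as a sketch for anyone not using fish. The flags are copied from the alias above; the function names are made up for illustration, and the second argument is assumed to be a numbered output pattern such as `part%03d.mp4`, which is what the segment muxer typically expects.

```python
import subprocess


def segment_command(src: str, pattern: str, segment_time: str = "30:00") -> list[str]:
    """Build the ffmpeg argument list used by the fish alias."""
    return [
        "ffmpeg",
        "-i", src,
        "-c:v", "copy",              # copy video without re-encoding
        "-c:a", "copy",              # copy audio without re-encoding
        "-f", "segment",             # use the segment muxer
        "-segment_time", segment_time,
        "-reset_timestamps", "1",    # each piece starts at time zero
        pattern,
    ]


def split_media(src: str, pattern: str, segment_time: str = "30:00") -> None:
    """Run ffmpeg and raise CalledProcessError if it fails."""
    subprocess.run(segment_command(src, pattern, segment_time), check=True)
```

As with the alias, the cuts are purely time-based, so this will not line up with chapter headings or silences.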
       
 (DIR) Post #B2egEb71SIxTDdbreK by raintrees@noauthority.social
       2026-01-25T05:33:29Z
       
       0 likes, 0 repeats
       
       @icedquinn thank you - copied to a text file for later play...
       I am about to crash, I was up and down ladders hanging security cameras today, time for some shut eye.
       Cheers!
       
 (DIR) Post #B2elJnO3kG6IVVsPJY by Ree@shitposter.world
       2026-01-25T04:35:26.442867Z
       
       0 likes, 1 repeats
       
       @icedquinn I didn't know you made a thing