[HN Gopher] Audapolis: Edit audio files by transcript, not waveform
       ___________________________________________________________________
        
       Audapolis: Edit audio files by transcript, not waveform
        
       Author : mavsman
       Score  : 162 points
       Date   : 2024-07-22 16:25 UTC (6 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | petarb wrote:
       | This is awesome to see as an open source project.
       | 
       | This functionality is some of my favorite when editing videos in
       | Descript. It's so much easier than chopping up waveforms in
       | Audacity
        
       | iainctduncan wrote:
       | IMHO you should really change the headline on this. I'm an audio
       | person, and my first thought was "that's stupid, words are awful
       | at describing sound". But then I looked, and editing
       | _transcriptions of voice recordings_ by word is actually a great
       | idea. That was not the impression the headline gave me, FWIW!
        
         | kajecounterhack wrote:
         | I'm also an audio person and I understood it just fine, so
         | whatevs.
        
           | iainctduncan wrote:
           | If you're trying to get attention, copy should be clear to
           | all readers. The fact that you did not misread it in no way
           | demonstrates that others won't.
           | 
           | And why the rude response?
        
             | IshKebab wrote:
             | I also understood it fine, but maybe we both just remember
             | the Adobe demo that vunderba mentioned. I guess it might
             | not be so obvious if you don't know about that?
             | 
             | On the other hand it does say "not waveform" which I think
             | makes it pretty clear. What would you suggest instead?
        
             | kajecounterhack wrote:
             | You're being insecure. It's not rude to disagree.
             | 
             | Also, there's often no perfect combo of words, there's a
             | spectrum of options and you just pick an operating point.
             | Transcription is a longer word than "word" so there's a
             | tradeoff. It doesn't feel like a chasm to me.
        
               | iainctduncan wrote:
               | Uh, no I'm not. In my work world, disagreeing with
               | "whatevs" would be considered rude and dismissive and
               | would be called out.
               | 
               | Believe me, I don't care that you disagree. I just don't
               | like to see people breaking the civility guidelines here
               | as it's just about one of the last places online where
               | discourse is largely held to a civil level for
               | disagreements.
               | 
               | I write copy professionally, among other things. If you
               | don't care whether what you write is clear to almost all
               | readers... then I suppose it doesn't matter. Most people
               | do not want misunderstandings of their copy and most copy
               | editors would flag that as unclear. The new version is
               | much better.
        
               | kajecounterhack wrote:
               | > I just don't like to see people breaking the civility
               | guidelines here as it's just about one of the last places
               | online where discourse is largely held to a civil level
               | for disagreements.
               | 
               | I seriously disagree that this breaks any sort of social
               | contract between you and I on the internet. It was
               | intended to be mildly dismissive but not overly rude.
               | There's a higher standard for communicating with care at
               | work (you should care about your coworkers), but do you
               | really think people on the internet have time for this
               | shit? I don't know you, guy.
        
               | dialsMavis wrote:
               | I'm genuinely curious what you were trying to convey by
               | completing your, totally valid, disagreement with "so
               | whatevs"? I believe this is the part that's perceived as
               | rude because the expansion of that, "whatever", is often
               | further expanded as the sarcastic form of "whatever you
               | say".
        
               | kajecounterhack wrote:
               | I was going for something between YMMV and "whatever you
               | say." The slight tilt toward the latter was received
               | poorly ¯\_(ツ)_/¯ Maybe it's generational.
        
         | RockRobotRock wrote:
         | To me (not an audio person), it was pretty obvious that the
         | headline meant editing voice recordings.
        
           | iainctduncan wrote:
           | It's not at all obvious. Given what we have seen recently, an
           | equally plausible interpretation is "talk to an LLM and it
           | will edit your audio" where audio could be anything.
           | 
           | It's not a good idea, but then tons of the LLM ideas we see
           | here aren't either.
        
         | dang wrote:
         | What would be a clearer title?
        
           | TheRealPomax wrote:
           | "[...] by transcript, not waveform".
        
             | dang wrote:
             | Done. Thanks!
        
             | iainctduncan wrote:
             | WAY better!
        
       | jiehong wrote:
       | That's awesome!
       | 
       | Is 1 emoji for each commit title a new trend?
        
         | larrybolt wrote:
         | I'm not sure how new the trend is, but it's called gitmoji
         | (https://gitmoji.dev/) and there's also tooling to make
         | committing/searching for the "correct" emoji easier :D Whatever
         | makes your job more fun, right? Oh and it saves on characters!
        
         | DJiK wrote:
         | Gitmoji has been around for eight years now.
         | https://gitmoji.dev/
        
       | alsetmusic wrote:
       | One of the hosts of a podcast that I listen to has had positive
       | things to say about Descript.[0] Just mentioning it because he's
       | been talking about it for a few years so I expect it's had a good
       | amount of feature development over time.
       | 
       | [0] descript.com/
        
         | mavsman wrote:
         | I love Descript. Their "convert to studio quality" feature is
         | better than Adobe's and ElevenLabs', in my experience.
         | 
         | I wondered if this particular feature was really worth paying
         | for so I was happy that I found Audapolis.
        
           | pimlottc wrote:
           | What does that feature do?
        
       | geekodour wrote:
       | this looks great! will try out. I built a similar but very
       | scrappy tool for the same use case last year; I probably
       | wouldn't have built it if I had found this.
       | 
       | [0] https://github.com/geekodour/wscribe-editor
        
       | hammeiam wrote:
       | I've spent some of my free time over the past couple of months
       | working on something similar. It's in a decent state but I need
       | help from somebody who understands the .fcpxml format so you can
       | export your edits to Davinci and FCP.
       | 
       | Take a look at https://matcha.video
        
       | emadda wrote:
       | Nice, are there plans to notarize the mac app?
       | 
       | I built something similar here: https://bigwav.app
        
       | vunderba wrote:
       | I remember when Adobe demoed this idea of being able to edit
       | waveforms by the recognized text back in 2016 and it was pretty
       | mind blowing for the time.
       | 
       | https://youtu.be/I3l4XLZ59iw
       | 
       |  _EDIT: I could also definitely see Audapolis being useful if you
       | could integrate it into a podcast's post-processing flow (volume
       | normalization, de-essing) by recognizing certain verbal tics and
       | automatically removing them from the audio such as "ummmm...",
       | etc._
        
         | Philip-J-Fry wrote:
         | What ever happened to that Adobe demo? Was that a real product
         | at any point? It's quite amazing how ahead of its time it was.
         | Now that we have AI making people say whatever we want, it felt
         | like Adobe was on the cusp of that then.
        
           | codetrotter wrote:
           | I remember people saying at the time that "this is the point
           | at which voice recordings can not be trusted any longer". And
           | then, like you said, nothing happened, kind of, for a few years
           | until the current AI/ML tech got to where it is currently at.
        
             | jazzyjackson wrote:
             | and there's still no commercial product for synthesizing
             | video to sync lip movements to edited transcript like all
             | the scary proof of concepts that turned the president into
             | a puppet
             | 
             | Maybe there's not much value in editing what someone said
             | after all
        
           | lofaszvanitt wrote:
           | They were probably strongarmed to give it up. Few years
           | later, with the AI craze... everything is ok.
        
       | leetrout wrote:
       | Hindenburg also added this capability.
       | 
       | > Hindenburg's manuscript feature gives you a complete overview
       | of your audio. You can select the text just as you would in a
       | text document and watch as your edits are made in real-time. If
       | you need to export your text in a specific format, no problem.
       | Hindenburg supports the most common text and transcription export
       | formats.
       | 
       | https://hindenburg.com/
        
       | pryelluw wrote:
       | If the maintainer is reading, having a demo video would be nice.
        
       | MForster wrote:
       | And here I was expecting that I could edit the text and the app
       | would change the audio file to say what I had typed...
        
         | MikeTheGreat wrote:
         | Can I ask what this tool does? I was trying to figure it out
         | (the GitHub page isn't terribly clear) and came to the same
         | conclusion you did (delete a chunk of the transcript and the
         | tool would delete that audio).
         | 
         | I think I just lack experience in this area. I've used Audacity
         | to cut out parts of audio / splice together two clips and
         | that's about it, so I clearly don't have enough background to
         | understand what this tool does.
         | 
         | Can someone clarify what this tool does, please? :)
        
       | frakkingcylons wrote:
       | Somewhat off-topic: I saw the funding note at the bottom - it's
       | pretty cool that the German government is giving some funding to
       | projects like this. I wonder how much the US is doing in that
       | regard, like if there's a list of projects that tax dollars go
       | towards.
        
         | vinniep1 wrote:
         | You can find some answers to that here: https://www.nsf.gov/
        
       | bluelightning2k wrote:
       | A genuinely free alternative to Descript sounds very useful.
       | 
       | I've always liked the idea of Descript and was considering
       | building something similar before it came out. The problem is my
       | use case is a couple of videos a year so doesn't fit with an
       | expensive monthly subscription
        
       | jdprgm wrote:
       | This really needs a video demo or at least a more in depth text
       | description of the features. Will download later to try but
       | curious does this just do simple hard cuts on audio text or is
       | there any ai magic for blending sentence timing if that makes
       | sense?
       | 
       | A number of comments turned me onto Descript -- made a similar
       | comment on another audio thread recently: drives me absolutely
       | insane how all audio tools with any AI are web based monthly saas
       | instead of offline private gpu upfront purchase.
        
         | aabhay wrote:
         | The web based tools launch and move faster. There's no lack of
         | offline tools, if you're the kind of person that files issue
         | tickets in their spare time
        
       | raymond_goo wrote:
       | Demo Video: https://pajowu.de/audapolis_intro.mp4
        
       ___________________________________________________________________
       (page generated 2024-07-22 23:03 UTC)