[HN Gopher] Audapolis: Edit audio files by transcript, not waveform
___________________________________________________________________
Audapolis: Edit audio files by transcript, not waveform
Author : mavsman
Score : 162 points
Date : 2024-07-22 16:25 UTC (6 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| petarb wrote:
| This is awesome to see as an open source project.
|
| This functionality is some of my favorite when editing videos in
| Descript. It's so much easier than chopping up waveforms in
| Audacity
| iainctduncan wrote:
| IMHO you should really change the headline on this. I'm an audio
| person, and my first thought was "that's stupid, words are awful
| at describing sound". But then I looked, and editing
| _transcriptions of voice recordings_ by word is actually a great
| idea. That was not the impression the headline gave me, FWIW!
| kajecounterhack wrote:
| I'm also an audio person and I understood it just fine, so
| whatevs.
| iainctduncan wrote:
| If you're trying to get attention, copy should be clear to
| all readers. The fact that you did not misread it in no way
| demonstrates that others won't.
|
| And why the rude response?
| IshKebab wrote:
| I also understood it fine, but maybe we both just remember
| the Adobe demo that vunderba mentioned. I guess it might
| not be so obvious if you don't know about that?
|
| On the other hand it does say "not waveform" which I think
| makes it pretty clear. What would you suggest instead?
| kajecounterhack wrote:
| You're being insecure. It's not rude to disagree.
|
| Also, there's often no perfect combo of words, there's a
| spectrum of options and you just pick an operating point.
| Transcription is a longer word than "word" so there's a
| tradeoff. It doesn't feel like a chasm to me.
| iainctduncan wrote:
| Uh, no I'm not. In my work world, disagreeing with
| "whatevs" would be considered rude and dismissive and
| would be called out.
|
| Believe me, I don't care that you disagree. I just don't
| like to see people breaking the civility guidelines here
| as it's just about one of the last places online where
| discourse is largely held to a civil level for
| disagreements.
|
| I write copy professionally, among other things. If you
| don't care whether what you write is clear to almost all
| readers... then I suppose it doesn't matter. Most people
| do not want misunderstandings of their copy and most copy
| editors would flag that as unclear. The new version is
| much better.
| kajecounterhack wrote:
| > I just don't like to see people breaking the civility
| guidelines here as it's just about one of the last places
| online where discourse is largely held to a civil level
| for disagreements.
|
| I seriously disagree that this breaks any sort of social
| contract between you and me on the internet. It was
| intended to be mildly dismissive but not overly rude.
| There's a higher standard for communicating with care at
| work (you should care about your coworkers), but do you
| really think people on the internet have time for this
| shit? I don't know you, guy.
| dialsMavis wrote:
| I'm genuinely curious what you were trying to convey by
| completing your, totally valid, disagreement with "so
| whatevs"? I believe this is the part that's perceived as
| rude because the expansion of that, "whatever", is often
| further expanded as the sarcastic form of "whatever you
| say".
| kajecounterhack wrote:
| I was going for something between YMMV and "whatever you
| say." The slight tilt toward the latter was received
| poorly ¯\_(ツ)_/¯ Maybe it's generational.
| RockRobotRock wrote:
| To me (not an audio person), it was pretty obvious that the
| headline meant editing voice recordings.
| iainctduncan wrote:
| It's not at all obvious. Given what we have seen recently, an
| equally plausible interpretation is "talk to an LLM and it
| will edit your audio" where audio could be anything.
|
| It's not a good idea, but then tons of the LLM ideas we see
| here aren't either.
| dang wrote:
| What would be a clearer title?
| TheRealPomax wrote:
| "[...] by transcript, not waveform".
| dang wrote:
| Done. Thanks!
| iainctduncan wrote:
| WAY better!
| jiehong wrote:
| That's awesome!
|
| Is 1 emoji for each commit title a new trend?
| larrybolt wrote:
| I'm not sure how new the trend is, but it's called gitmoji
| (https://gitmoji.dev/) and there's also tooling to make
| committing/searching for the "correct" emoji easier :D Whatever
| makes your job more fun, right? Oh and it saves on characters!
| DJiK wrote:
| Gitmoji has been around for eight years now.
| https://gitmoji.dev/
| alsetmusic wrote:
| One of the hosts of a podcast that I listen to has had positive
| things to say about Descript.[0] Just mentioning it because he's
| been talking about it for a few years so I expect it's had a good
| amount of feature development over time.
|
| [0] descript.com/
| mavsman wrote:
| I love Descript. Their "convert to studio quality" feature is
| better than Adobe's and ElevenLabs', in my experience.
|
| I wondered if this particular feature was really worth paying
| for so I was happy that I found Audapolis.
| pimlottc wrote:
| What does that feature do?
| geekodour wrote:
| This looks great! Will try it out. I built a similar but very
| scrappy tool for the same use case last year; I probably wouldn't
| have built it if I'd found this.
|
| [0] https://github.com/geekodour/wscribe-editor
| hammeiam wrote:
| I've spent some of my free time over the past couple of months
| working on something similar. It's in a decent state but I need
| help from somebody who understands the .fcpxml format so you can
| export your edits to DaVinci and FCP.
|
| Take a look at https://matcha.video
| emadda wrote:
| Nice, are there plans to notarize the mac app?
|
| I built something similar here: https://bigwav.app
| vunderba wrote:
| I remember when Adobe demoed this idea of being able to edit
| waveforms by the recognized text back in 2016 and it was pretty
| mind blowing for the time.
|
| https://youtu.be/I3l4XLZ59iw
|
| _EDIT: I could also definitely see Audapolis being useful if you
| could integrate it into a podcast's post-processing flow (volume
| normalization, de-essing) by recognizing certain verbal tics, such
| as "ummmm...", and automatically removing them from the audio._
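
A minimal sketch of the verbal-tic idea above, assuming the transcript is available as a list of `(text, start, end)` tuples with times in seconds. The filler pattern and the `find_fillers` name are illustrative assumptions, not part of Audapolis or any other tool mentioned in this thread.

```python
import re

# Hypothetical filler pattern: "um", "ummmm", "uh", "er", "erm", "hmm", ...
FILLER = re.compile(r"u+m+|u+h+|e+r+m*|h+m+")


def find_fillers(words):
    """Return the indices of transcript words that look like verbal fillers.

    `words` is a list of (text, start_seconds, end_seconds) tuples.
    """
    hits = []
    for i, (text, _start, _end) in enumerate(words):
        # Drop surrounding punctuation so "ummmm..." and "uh," still match.
        token = text.strip(".,!?…\"'").lower()
        if FILLER.fullmatch(token):
            hits.append(i)
    return hits


if __name__ == "__main__":
    sample = [
        ("So", 0.0, 0.3),
        ("ummmm...", 0.3, 1.1),
        ("I", 1.1, 1.2),
        ("uh,", 1.2, 1.5),
        ("think", 1.5, 1.9),
    ]
    print(find_fillers(sample))  # [1, 3]
```

The returned indices could then be marked as deleted in a transcript editor. A real pipeline would likely need a per-language filler list and a manual review step, since a pattern this loose also flags rare real words such as "err".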
| Philip-J-Fry wrote:
| What ever happened to that Adobe demo? Was that a real product
| at any point? It's quite amazing how ahead of its time it was.
| Now that we have AI making people say whatever we want, it felt
| like Adobe was on the cusp of that then.
| codetrotter wrote:
| I remember people saying at the time that "this is the point
| at which voice recordings cannot be trusted any longer". And
| then, like you said, not much happened for a few years until
| AI/ML tech got to where it is now.
| jazzyjackson wrote:
| and there's still no commercial product for synthesizing
| video to sync lip movements to an edited transcript, like all
| the scary proofs of concept that turned the president into
| a puppet
|
| Maybe there's not much value in editing what someone said
| after all
| lofaszvanitt wrote:
| They were probably strongarmed to give it up. Few years
| later, with the AI craze... everything is ok.
| leetrout wrote:
| Hindenburg also added this capability.
|
| > Hindenburg's manuscript feature gives you a complete overview
| of your audio. You can select the text just as you would in a
| text document and watch as your edits are made in real-time. If
| you need to export your text in a specific format, no problem.
| Hindenburg supports the most common text and transcription export
| formats.
|
| https://hindenburg.com/
| pryelluw wrote:
| If the maintainer is reading, having a demo video would be nice.
| MForster wrote:
| And here I was expecting that I could edit the text and the app
| would change the audio file to say what I had typed...
| MikeTheGreat wrote:
| Can I ask what this tool does? I was trying to figure it out
| (the GitHub page isn't terribly clear) and came to the same
| conclusion you did (delete a chunk of the transcript and the
| tool would delete that audio).
|
| I think I just lack experience in this area. I've used Audacity
| to cut out parts of audio / splice together two clips and
| that's about it, so I clearly don't have enough background to
| understand what this tool does.
|
| Can someone clarify what this tool does, please? :)
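
One way to picture the behavior described above (delete a chunk of the transcript and the matching audio is deleted) is as bookkeeping over word timestamps. The sketch below is illustrative only and is not Audapolis's actual code: the `Word` type, the function names, and the assumption that every word carries start and end times in seconds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the recording
    end: float


def cut_ranges(words, deleted):
    """Turn deleted word indices into merged (start, end) ranges to cut."""
    cuts = []
    run_start = None
    run_end = None
    for i, word in enumerate(words):
        if i in deleted:
            if run_start is None:
                run_start = word.start
            run_end = word.end
        elif run_start is not None:
            # A kept word ends the current run of deleted words.
            cuts.append((run_start, run_end))
            run_start = None
    if run_start is not None:
        cuts.append((run_start, run_end))
    return cuts


def keep_ranges(cuts, duration):
    """Return the audio ranges that remain once the cuts are removed."""
    keeps = []
    position = 0.0
    for start, end in cuts:
        if start > position:
            keeps.append((position, start))
        position = max(position, end)
    if position < duration:
        keeps.append((position, duration))
    return keeps


if __name__ == "__main__":
    words = [
        Word("hello", 0.0, 0.5),
        Word("um", 0.5, 0.9),
        Word("uh", 0.9, 1.2),
        Word("world", 1.2, 1.8),
    ]
    cuts = cut_ranges(words, deleted={1, 2})
    print(cuts)                    # [(0.5, 1.2)]
    print(keep_ranges(cuts, 2.0))  # [(0.0, 0.5), (1.2, 2.0)]
```

An editor built this way would then concatenate the kept ranges with whatever audio library it uses. Cutting exactly at word boundaries gives hard cuts; crossfades or padding would be a separate design choice.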
| frakkingcylons wrote:
| Somewhat off-topic: I saw the funding note at the bottom - it's
| pretty cool that the German government is giving some funding to
| projects like this. I wonder how much the US is doing in that
| regard, like if there's a list of projects that tax dollars go
| towards.
| vinniep1 wrote:
| You can find some answers to that here: https://www.nsf.gov/
| bluelightning2k wrote:
| A genuinely free alternative to Descript sounds very useful.
|
| I've always liked the idea of Descript and was considering
| building something similar before it came out. The problem is my
| use case is a couple of videos a year, so it doesn't fit with an
| expensive monthly subscription.
| jdprgm wrote:
| This really needs a video demo or at least a more in-depth text
| description of the features. Will download later to try, but I'm
| curious: does this just do simple hard cuts on the audio, or is
| there any AI magic for blending sentence timing, if that makes
| sense?
|
| A number of comments turned me onto Descript -- I made a similar
| comment on another audio thread recently: it drives me absolutely
| insane how all audio tools with any AI are web-based monthly SaaS
| instead of offline, private, local-GPU upfront purchases.
| aabhay wrote:
| The web based tools launch and move faster. There's no lack of
| offline tools, if you're the kind of person that files issue
| tickets in their spare time
| raymond_goo wrote:
| Demo Video: https://pajowu.de/audapolis_intro.mp4
___________________________________________________________________
(page generated 2024-07-22 23:03 UTC)