[HN Gopher] Show HN: Recut automatically removes silence from vi...
___________________________________________________________________
Show HN: Recut automatically removes silence from videos. It's
built with Tauri
Author : dceddia
Score : 161 points
Date : 2022-06-16 16:19 UTC (6 hours ago)
(HTM) web link (getrecut.com)
(TXT) w3m dump (getrecut.com)
| amccloud wrote:
| Nice relaunch, I've been following Recut for a while. I was once
| working on a web version of all this before I discovered Recut.
|
| https://beta.jumpcutter.pro/
| dceddia wrote:
| Thanks! That's cool. It's a good order of magnitude better than
| my first stab at this, which was a Node script + ffmpeg that
| spit out an EDL file, haha.
| mjwhansen wrote:
| Super cool!
| dceddia wrote:
| I released a new version of Recut recently, rewritten from the
| ground up using Rust, Svelte, Tauri, TypeScript, and Tailwind
| (RUSTTT stack for the win!). It's the first app I've built with
| Tauri and I've really enjoyed it.
|
| Some back story: Recut is a tool I built to speed up my
| screencast editing workflow. It's like a lightweight single-
| purpose video editor. It chops out the pauses, with some knobs to
| tweak how closely it cuts and what it leaves in, and lets you get
| a live preview of what it'll look and sound like with the cuts
| applied. It can then export to a handful of other editors,
| nondestructively, so that you can use the full capabilities of a
| "real" video editor.
|
| It was originally a native Mac app written in Swift, and people
| kept asking for a Windows version. I had learned Swift and macOS
| development to build it originally. So as a solo developer, I had
| some choices to make. Keep it Mac-only? Learn _another_ whole
| language + UI framework, rebuild the app, and maintain two
| codebases? Rebuild the app with a cross-platform toolkit?
|
| I'd had experience with Qt and C++ in years past, but I honestly
| didn't love the idea of getting back into C++ and dealing with
| the inevitable hard-to-debug segfaults. I'd had more recent
| experience as a web developer, but I was worried about
| performance bottlenecks. I actually started down the path of
| building Recut in Electron and Rust (using NAPI-RS for bindings)
| and it looked promising, but I was still worried about the bloat
| of Electron.
|
| A few months in, I took a closer look at Tauri, and ported the
| whole app from Electron in a week or so. Most of the heavy
| lifting was already in Rust, and the UI stuff pretty much "just
| worked". The biggest change was the bindings between JS and Rust.
|
| Working with Tauri has been nice. I especially like their "State"
| system, which gives you an easy way to keep app-wide state on the
| Rust side, and inject specific parts of it into functions as
| needed. I also really like how easy it is to write a Rust
| function and expose it to JS. The process model feels a lot
| easier to work with compared to Electron's split between renderer
| and main and preload, where you have to pay the cost of passing
| messages between them lest you ruin the security. Tauri's
| message-passing has a decent amount of overhead too, but I dealt
| with that by avoiding sending large amounts of data between JS
| <-> Rust and it's been fine.
|
| The Tauri folks on Discord were a big help too (shout out to
| Fabian for the help when I ran into weird edge cases). I think
| Tauri has a bright future! Definitely worth a look if you know
| web tech and want to make cross-platform apps.
| AlchemistCamp wrote:
| Thank you for sharing so much of this development process over
| the years. It's been fascinating to watch from a distance and I
| have to say, Tauri looks more and more appealing.
| dceddia wrote:
| Thanks! And yeah Tauri is pretty nice, and seems to be
| rapidly improving.
| metadaemon wrote:
| Did you notice any CPU/Memory advantages in switching to Tauri
| over Electron?
| dceddia wrote:
| Memory usage seems to be lower with Tauri but I don't have
| any hard numbers. CPU is better largely because of finding a
| way to draw video frames that involves less copying. It's a
| (frankly, messy) hack that uses a native window/view
| positioned in the right place that I can render directly to
| as a GPU surface, which mayyybe could've been done with
| Electron with a lot of mucking around in Chromium internals,
| but Tauri makes it much easier to access platform APIs.
|
| I know for certain that there are still performance gains to
| be had, but I'm also confident that they're 100% in my
| control - Tauri and the WebView aren't the bottleneck. That
| was one of my big fears with Electron - what if something is
| slow as molasses and it's just stuck that way? I haven't run
| into a wall like that with Tauri yet and at this point I
| don't expect I will.
| nu11ptr wrote:
| The main website video still shows what looks like a native
| macOS app. Any pictures/video of the new version? What UI
| toolkit (React?) and UI library (Material UI?) did you end up
| using with Tauri? Curious because this is something I'll be
| embarking on soon.
| dceddia wrote:
| I need to update some screenshots! It looks very similar
| though, I tried to mimic the UI pretty closely. This video
| shows a demo on Windows: https://youtu.be/wuy-LKSE3y0
|
| I'm using Svelte and Tailwind (not Tailwind UI) so the UI is
| custom, plus or minus some of Tailwind's defaults.
| nu11ptr wrote:
| Nice - I have no UI skills so will definitely use an
| existing lib. I know my limits. :-)
|
| Did you notice any diff in UI speed with view being in
| JS/HTML/CSS? Response time seems nice in your video so I'm
| guessing negligible?
| dceddia wrote:
| Nope, it's fine, and maybe even faster actually. The
| slowest part is drawing the waveform and the (potentially
| thousands of) red silent areas on a <canvas>, and the
| Tauri app does better than the Swift app did there. The
| 2D canvas is GPU-accelerated so it's pretty snappy even
| without doing a bunch of optimizing with dirty rectangles
| etc, whereas the Swift NSView isn't hardware accelerated
| so it required a lot more hand optimizing and I think
| it's still not ideal.
|
| The big thing is to avoid blocking the UI thread, so if
| you're calling into a Rust function that could take more
| than a couple milliseconds, marking that Rust function
| (aka Tauri command) as async will run it in a background
| thread and the UI won't hiccup.
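The "don't block the UI thread" advice above can be illustrated outside of Tauri. The following is a minimal Python sketch of the general pattern only, not Tauri's API: `find_silence_slowly` is a hypothetical stand-in for a slow backend call, and the polling loop stands in for a UI event loop that keeps running while the work happens on a background thread.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def find_silence_slowly(samples):
    """Hypothetical slow backend call: counts near-silent samples."""
    time.sleep(0.05)  # simulate work that takes longer than a frame
    return sum(1 for s in samples if abs(s) < 0.01)


def run_without_blocking(samples):
    """Run the slow call on a worker thread while the 'UI loop' keeps ticking."""
    ui_ticks = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(find_silence_slowly, samples)
        while not future.done():
            ui_ticks += 1      # a real UI would repaint or handle input here
            time.sleep(0.005)  # roughly one short frame
        return future.result(), ui_ticks


quiet_count, ticks = run_without_blocking([0.0, 0.5, 0.001, -0.3])
print(quiet_count, "quiet samples;", ticks, "UI ticks while waiting")
```

If the slow call ran directly on the loop's thread instead, the tick counter would stay at zero until the call returned, which is the hiccup described above.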
| nu11ptr wrote:
| Thanks for the feedback - this is helpful
| faitswulff wrote:
| What are your thoughts on Swift vs Rust? I've used Rust and
| another engineer showed me guard let statements from Swift,
| which gave me the impression they had some of the same
| sensibilities.
| tylerc230 wrote:
| They share a lot of similarities. Both make it hard to do
| unsafe things, both have functional influences, both have
| modern features like closures, optionals etc. I'd say the
| biggest philosophical difference between the two is that
| Swift leans more toward developer ergonomics while Rust is
| geared toward system-level programming (i.e. tighter control
| over memory etc.).
| dceddia wrote:
| Yeah they have some syntactical similarities, like `if let`,
| and no parentheses with if's and for's. I really miss `guard
| let` in Rust! (I've heard it's coming at some point though?)
|
| Swift leans _real_ hard into verbose method names, just like
| Objective C did, and Rust is pretty much the exact opposite
| there. When I was writing more Swift I got used to it, and
| actually started to like it. And now that I'm in Rust a lot,
| I like the brevity.
|
| I think Rust nudges (forces?) me to write code that's
| architecturally better, with fewer interdependencies. It was
| hard to get used to though. In the beginning I kept trying to
| make structs that started threads, where the thread called
| methods on the struct, and that was just a recipe for big
| pain.
|
| Swift, on the other hand, especially with the way the macOS
| and iOS frameworks are designed, relies a lot on MVC,
| delegates, and mutability, which gets hard to keep track of.
| Klonoar wrote:
| It's interesting you say that Swift leans hard into
| verbosity - I've found the opposite. Most of the verbosity
| often feels like it exists due to ObjC "lineage".
|
| (I say this as someone who likes verbosity)
| dceddia wrote:
| Interesting! Maybe we're using different definitions of
| verbosity too, or I used the wrong word there.
|
| I was thinking more about the long method names which
| definitely do feel like they're carried over from ObjC.
| And then, a lot of those names are really more from the
| commonly-used frameworks than the language itself, so
| maybe it's not fair to say that "Swift" has those long
| names, but it does feel like the 99% use case for Swift
| is using those frameworks.
|
| I think I agree that Swift felt like it needed less code
| to do a thing than ObjC would have, in a lot of cases.
| boopmaster wrote:
| I don't know why, but the first thing I thought of was the remake
| version of Shunsuke Kida's "Maiden in Black", from the Demon's
| Souls OST, having all the silence, and with it the tension and
| negative space, taken away from it. I know that's different from
| YouTuber videos, but that's where my headspace went.
| PaoloBarbolini wrote:
| Awesome tool. I hope it gets a Linux release some day :)
|
| As a DaVinci Resolve user I'm amazed at how good the Smooth Cut
| transition is at hiding cuts if the head moved very little during
| the cut portion of the video. It might be worth exploring more
| and seeing if it would make sense as an optional flag.
| dceddia wrote:
| Maybe some day it will come to Linux! I think technically most
| of it would already work there, but some stuff requires native
| windowing API calls that I'll need to figure out and port
| over... plus all the install/distribution stuff... and the, uh,
| enhanced surface area for support haha. It would be awesome to
| get it running on Linux though.
|
| I haven't played with Smooth Cut, I'll have to check that out.
| It sounds handy! Might be something I could just "turn on" in
| the XML file too, I'm not sure.
| calvinmorrison wrote:
| Semi-relatedly, I am having a problem with my Bluetooth
| headphones and Audible. Apparently anytime there is more than a
| breath of a pause, the module power-saves, and then when it kicks
| back in you missed the first word or two of the sentence. Fine
| for some books, for others, it makes them very hard to listen to.
|
| My only idea so far was: some sort of app that generates very
| quiet noise so that it won't power save?
| dceddia wrote:
| Weird. Maybe one of those white noise generator apps? If the
| frequency were set super low so that it's below hearing range,
| maybe the headset would still think there's signal coming
| through.
| tylerchurch wrote:
| If you're on an Apple device, you can make the phone itself
| generate white noise by going to: Settings > Accessibility >
| Audio/Visual (under Hearing) > Background Sounds
| kthxb wrote:
| Can anyone here (maybe the author) say how fast this is? How long
| does it need for one video? Is it interactive even?
| dceddia wrote:
| Hey, author here :) It varies by the length of the video and
| some other things, but on a 2019 Intel MBP it's loading up a
| 35min file in about 5 seconds. After that, it's interactive -
| you can adjust the silence-finding settings in real time, and
| seek around and hit Play like any other video editor, and it'll
| play back while skipping over the silent parts.
|
| The slowest parts are (1) loading up the audio to find silence
| and (2) if you decide to export an MP4, encoding that.
| Exporting an XML timeline is near-instantaneous.
| hillcrestenigma wrote:
| This looks pretty cool. Might be too similar to
| https://jumpcutter.com/ though.
| yellowapple wrote:
| This program looks like it fulfills its purpose quite nicely.
| Well done!
|
| Unfortunately, that purpose happens to be something that
| absolutely drives me up the wall. Few things cause me to close a
| video and outright block its creator faster or more vigorously
| than cutting out pauses between sentences and phrases. It's great
| that your demo video recognizes that to be a problem, but even
| after accounting for it it's still jarring - and let's face it,
| just about zero users of your software are gonna account for it.
|
| What happened to the good old days of doing multiple takes and
| rehearsing?
|
| In any case, nice work on it, and I hope your customers use this
| power responsibly and unnoticeably :)
| mikestew wrote:
| _What happened to the good old days of doing multiple takes and
| rehearsing?_
|
| How well does that work on a live presentation before an
| audience, do you think? Say, a preacher delivering a Sunday
| morning sermon? PyGoSwiftCon 2023 tech presentation? "Here's
| last Saturday's video, can you post that to $WEBSITE?" One does
| not always have the luxury of a retake.
| yellowapple wrote:
| And in those contexts you can't exactly cut the pauses out,
| either. They're live; pauses and other "imperfections" are
| unavoidable, expected, and an intrinsic part of the
| performance - and if your next thought is "but what about the
| recording of it?", I can think of few things worse to do to
| the recording of a live performance than utterly butchering
| it for the sake of tiny pauses.
|
| In any case, I said "and rehearsing"; people can and do
| rehearse live presentations and sermons and other speeches.
| That's in fact a very common thing: write out what you're
| going to say (or pay someone to write it for you), rehearse
| it in front of friends or family or pets or your mirror,
| possibly even memorize it.
| programmarchy wrote:
| Great job identifying a niche and executing well on a focused
| feature set. That's something I wish I could do better!
| Animats wrote:
| How about one that also removes "uh", "er", "hi guys", ads, and
| beginning stretches where "I" predominates?
| G4E wrote:
| You have SponsorBlock[0] to at least skip the ads and the
| reminders to "like and share and subscribe". It works really
| well.
|
| However, YouTube Premium might be the solution if you want to
| support your favorite channels without ads.
|
| [0]https://addons.mozilla.org/en-US/firefox/addon/sponsorblock/
| capableweb wrote:
| I think they are asking for the feature from the perspective
| of someone producing videos, not consuming.
| wccrawford wrote:
| No, they talked about removing "hi guys" and "ads", so
| that's definitely a consumer.
| userhacker wrote:
| You can use revoldiv.com to cut out filler words or any words
| of your choosing. After you upload your file and it finishes
| sound detection, you can click on the search box to bring up
| the toolbar to delete sounds.
| dceddia wrote:
| That looks nice! Is it your tool? I wonder how it's
| supporting free transcription since most of the good APIs for
| it are pay-per-minute.
| userhacker wrote:
| Thanks, yes it is. We implemented all the AI models in-house,
| which cuts our costs.
| dceddia wrote:
| That's awesome, nice work!
| corrral wrote:
| So many potentially-interesting channels have lost me because
| two minutes into the first video I try, they're still talking
| about previous videos or irrelevant personal stuff or their
| subscriber count or WTF ever. Close tab.
| tiku wrote:
| I recently watched a video from a vlogger that cut her videos. I
| wondered why there wasn't a tool for morphing cut pieces of film
| out, so that you don't see the cut. So the transition in the cut
| piece needs to be generated with AI/deepfake.
| dceddia wrote:
| I feel like I saw a similar thing recently, that generated the
| "missing" video given 2 end points. Might've even been still
| frames? It was impressive. Can't remember where I saw it now,
| though.
| basch wrote:
| George Lucas was doing it in the Star Wars prequels. I also
| noticed it happen in Barry in the last episode or two, when
| he kneels down in a desert, his face and body appear
| seamless, but his hair dissolves between two shots.
| WrtCdEvrydy wrote:
| That's actually called a Morph Cut in most editing software.
| gyan wrote:
| Which license is the app under? The trial installer didn't show
| me any EULA.
| KaoruAoiShiho wrote:
| Is it possible for your app to take into account not just audio
| but also facial expressions? AKA do not cut out the parts where
| the speaker is silently making unusual faces or facial signals.
| dceddia wrote:
| Possibly! Not currently, for sure. It'll have (again, soon) a
| feature that lets you manually override Recut's choices though,
| so you could select a meaningful silent part and leave it in.
|
| I think at some point any sort of automation is bound to get
| something wrong and it'll likely never be perfect, so my goal
| is to add enough manual control that you're never just stuck
| with whatever the app decided.
| KaoruAoiShiho wrote:
| That's cool, hopefully it can be a feature request you take
| seriously, it may not even be that hard I think to throw in a
| facial recognition library or something. The issue with doing
| it manually is that I would have to basically scan the entire
| video manually and that sort of counteracts the purpose.
| JadoJodo wrote:
| First thought: This is really great! Based on the amount of
| YouTubers that I see do this to their videos, I can imagine this
| would be a really great tool, and I think it's awesome that you
| made it, OP.
|
| Second thought: Is anyone else really annoyed by the constant
| cuts in videos these days? I find it distracting at times and
| completely jarring in others. I've never made a video for
| consumption, so I could imagine there are a lot of "re-takes" +
| cutting out of "umms", but I just find it a bit sad that
| everything has to be SO clean-cut these days.
| varispeed wrote:
| I don't know. I have ADHD and I can't watch a slow paced video
| (unless there is some sort of tension built in that gets brain
| cogs spinning). I often watch videos at 2x speed and I wish
| YouTube had an option for 4x.
|
| If there are any pauses I quickly lose interest and go on doing
| something else and forgetting I even started watching
| something.
| jasode wrote:
| _> Is anyone else really annoyed by the constant cuts in videos
| these days? [...], so I could imagine there are a lot of "re-
| takes" + cutting out of "umms"_
|
| I agree it's jarring but I understand why it often happens:
| it's easier and faster to fix a vocal mistake by backing up
| _just a sentence or two_ and then re-record from there. So it's
| not always just removing the pauses; he/she actually fixed a
| speech mistake in that sentence and had to splice it in.
|
| The alternative to avoid jarring cuts is to record _longer
| takes of reading paragraph-length word counts without a
| mistake_. This is much more difficult and time-consuming. So if
| you're trying to speak 10 good sentences and you flub sentence
| #10, you have to start all over at sentence #1 to maintain one
| continuous take. Otherwise, you'd have a jarring cut between
| sentence #9 and #10. E.g. you can see the bloopers outtakes at
| the end of each Technology Connections video to see that even
| reading from a script without mistakes is not easy.
| geerlingguy wrote:
| Jump cuts, as they're called, can also be covered over by
| judicious use of b-roll and other supporting material.
|
| But again, like longer reads, it takes more time/work. So a
| lot of channels just leave the raw jump cuts.
| yieldcrv wrote:
| And by pre-splicing with this software it is telling you
| where to put the b-roll in, saving loads of time!
| justincormack wrote:
| The other way to do this is to move to another camera angle
| which hides the cuts, although you certainly don't want to
| do this if the cuts are very short like the ones in the
| example on the website. Recording 4K footage, you can cut
| multiple HD "angles" out of it, e.g. zooms which you can use
| instead of multiple cameras.
| dceddia wrote:
| Thanks! And yeah! It sounds super unnatural when even the
| tiniest pauses are cut out. Obvs I can't prevent anyone from
| using it that way, but I set the defaults to leave a good 1/2
| second of space on either side of each cut, and to leave in any
| silent chunks that are around 1/2sec or so.
|
| Personally I think of the silence as a heuristic - it's the gap
| between re-takes, so if I cut at those points and then delete
| the bad takes, it saves a ton of time.
|
| This makes me think an interesting workflow to support might be
| something like, set the settings tight to get all the cuts,
| then delete the bad takes, then bring the pauses back.
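The defaults described above (keep some padding on either side of each cut, and leave short pauses alone) can be sketched as a small function. This is a minimal Python illustration, not Recut's actual implementation: the function name `plan_cuts`, its parameters, and the exact 0.5 second values are assumptions based on the "about 1/2 second" figures in the comment.

```python
def plan_cuts(silences, padding=0.5, min_silence=0.5):
    """Turn detected silent spans into the spans that actually get removed.

    silences: list of (start, end) times in seconds, in order.
    padding: seconds of silence kept on each side of a cut (assumed default).
    min_silence: silent spans this short or shorter are left in (assumed default).
    """
    cuts = []
    for start, end in silences:
        if end - start <= min_silence:
            continue  # a natural pause: leave it in
        cut_start = start + padding
        cut_end = end - padding
        if cut_end > cut_start:  # skip spans the padding swallows entirely
            cuts.append((cut_start, cut_end))
    return cuts


detected = [(1.0, 1.4), (3.0, 6.0), (8.0, 8.9)]
print(plan_cuts(detected))  # [(3.5, 5.5)]
```

In this example only the 3 second gap is trimmed. The 0.4 second pause is left alone, and the 0.9 second one is too short to survive half a second of padding on both sides.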
| Sohcahtoa82 wrote:
| I've seen 30-second videos with a cut after each sentence. It's
| incredibly jarring.
|
| If you can't give a 30-second spiel in one cut, then you keep
| trying until you do. Alternatively, you splice in some other
| graphic or video to hide when a cut happens.
| Gordonjcp wrote:
| 30 seconds is too long for a shot. You lose the audience
| after 15 seconds, especially if it's just a talking head.
| Imagine you're standing in front of me as I tell you all this
| - you're not looking right at my face, looking me right in
| the eye for the 15 seconds or so it took to get to this
| point. You're looking away. You look over my shoulder at the
| thing behind me, you look at the editing equipment on the
| table, maybe look at what's on my screen. We're up around 25
| seconds now, and your gaze has shifted at least half a dozen
| times.
|
| In video editing you mimic this by cutting away to other
| angles, or to illustrative shots. Right around the end of the
| first sentence I cut to a longer "two-shot" showing us
| talking. At "to get to this point" I cut to a head-and-
| shoulders shot of you nodding in agreement (a "noddy shot",
| done after my piece to camera, getting you to look at the
| right height to match my eyeline). On "you're looking away" I
| cut back to me, and then a shot of my PC on the bench with
| some editing software open (bonus points for having it
| showing an earlier shot from this). On "and your gaze", it's
| back to me.
|
| What you were actually looking at was cutting back and forth
| wildly, showing you something different every five to ten
| seconds, but somehow you didn't even see it move.
| jstanley wrote:
| Alternatively, you just edit out the silences. What's the
| problem?
| dceddia wrote:
| From what I've seen and experienced myself, there's a
| definite learning curve to making videos.
|
| In the beginning I remember it being maddeningly hard to even
| get 10 seconds out without messing it up. But then also,
| there's a technique and a skill to getting the cuts to sound
| natural.
|
| I've seen plenty of YouTubers who cut after every sentence
| and make it sound & look natural, but plenty more who cut the
| same amount and it looks jarring. Keeping your head in the
| same spot helps. Trying to speak one full thought at a time
| helps too.
|
| The worst is when you get on a roll, get 30 seconds into your
| roll, and then completely lose it and can't remember where to
| "roll back" to.
| alar44 wrote:
| Some really well-done channels are unwatchable for me due to
| this. In tandem with a misused audio compressor (attack set way
| too long), the pumping makes it even worse.
| ryanmcbride wrote:
| > Is anyone else really annoyed by the constant cuts in videos
| these days?
|
| I was initially, but my brain has just accepted it as part of
| videos these days. I barely notice anymore unless it's really
| jarring, or done somewhere that doesn't make sense.
| tomcam wrote:
| I'm old. It bothered me at first, but now I view it as a
| service to the listener. Somehow I got used to the abruptness
| fairly fast and then became addicted to the brevity.
| danielvaughn wrote:
| I like the idea of cutting out silence, but editors need to
| understand that we need a _bit_ of a gap in order to consume
| the content. I feel like I'm listening to a 10 year old with
| ADHD when you cut out literally any pause whatsoever.
| evv wrote:
| Recut lets you configure that with a little slider
| niels_bom wrote:
| As a person with ADHD: I get very easily distracted if the
| tempo of a video is low. I use the "Video Speed Controller"
| browser extension to run most videos at 1.5 to 2x speed.
| throwaway81523 wrote:
| There is already an ffmpeg command line option for this, I
| thought. ffmpeg -af silenceremove=whatever.
| colechristensen wrote:
| I understand small time creators with lots of cuts. They're
| getting by on pretty small ad revenue and are amateurs not
| seasoned performers. The time and talent to have long takes
| without mistakes is hard, and I appreciate the content more
| than I want super polished production values.
| Gordonjcp wrote:
| > but I just find it a bit sad that everything has to be SO
| clean-cut these days.
|
| It's _not_ "clean-cut" though. Jump cuts in pieces to camera
| look absolutely shite.
|
| If you want to cut a bit out for pacing or to remove an "uhm uh
| <cough> so uh" then you cut away to something else. Maybe a
| close-up of what you're talking about, or to another camera
| angle.
|
| Just chopping a bit out so you hop about the screen looks
| amateurish as all hell.
| linsomniac wrote:
| Reading your reply I immediately thought of a video I watched
| last week that would have been a good 10 minute video, probably
| a great 5 minute video, but was in reality a 25 minute "stream
| of consciousness" video. Every step in the process he described
| ~3 times, some of them were because he was waiting for a long
| running process to complete, some I think were just habit?
|
| Much of YouTube is dealing with people who leave your video
| after a minute or two, so shorter is generally better.
|
| I very much appreciate people who put the time into getting rid
| of the superfluous in their videos. One of my highest
| performing videos is a 26 second "how to" video, the majority
| of comments are "Thank you for not making this a 5 minute
| video like the others on this topic". (Removing the riving
| knife from a DeWalt table saw, FYI).
|
| I recently experimented with going entirely the other way, and
| probably went too far. Can I teach using the Python Typer (CLI
| argument parsing) library in 60 seconds? Feedback from friends
| is "Man, that's DENSE!" https://youtu.be/1iO7wqnC7qw
| AlchemistCamp wrote:
| They might be referring more to the tiny font size than the
| pacing of the video.
| linsomniac wrote:
| They didn't mention font size ("moved a little fast" was a
| comment for example), but I appreciated that you did. :-)
| hardwaregeek wrote:
| Yeah one subtle detail I really like about RedLetterMedia is
| that they'll have these cuts that last juuust a little longer
| than normal. It gives a nice punctuation to the video that
| feels almost classical in form.
| corrral wrote:
| The difference between them and most YouTubers is that they
| came at it backwards--they understood video production
| _before_ they became YouTubers.
| tjfl wrote:
| I love this idea. I stumbled upon a Gist[0] from vivekhaldar[1]
| some time ago and it really helped out when I had to create a
| screen recording for a colleague. Definitely not as polished as
| Recut, though.
|
| [0]
| https://gist.github.com/vivekhaldar/92368f35da2d8bb8f12734d8...
| [1] https://www.youtube.com/c/vivekhaldar
| dceddia wrote:
| Neat idea with using colors to edit!
|
| My first stab at something before Recut was a Node script that
| did a similar sort of thing, just based on silence though,
| using ffmpeg's silencedetect to find silent parts and
| generating a cut list. And then as soon as it worked, I was
| like "well that's cool but I really want it to be interactive"
| and then... well it turned out that making a UI video editor
| was way harder than that script haha, but eventually Recut came
| to exist.
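The "silencedetect to cut list" approach mentioned above can be sketched in a few lines. This is a minimal Python illustration, not the original Node script: the helper names are hypothetical, and `SAMPLE_LOG` is hand-written to mimic the `silence_start:` / `silence_end:` lines that ffmpeg's silencedetect filter writes to stderr (for example when run as `ffmpeg -i input.mp4 -af silencedetect=noise=-30dB:d=0.5 -f null -`), not captured from a real run.

```python
import re

# Hand-written sample in the shape of silencedetect's log lines (an assumption).
SAMPLE_LOG = """\
[silencedetect @ 0x0] silence_start: 4.0
[silencedetect @ 0x0] silence_end: 6.5 | silence_duration: 2.5
[silencedetect @ 0x0] silence_start: 10.25
[silencedetect @ 0x0] silence_end: 12.0 | silence_duration: 1.75
"""

NUMBER = r"(-?\d+(?:\.\d+)?)"


def parse_silences(log_text):
    """Pair up silence_start / silence_end times found in the log text."""
    starts = [float(x) for x in re.findall("silence_start: " + NUMBER, log_text)]
    ends = [float(x) for x in re.findall("silence_end: " + NUMBER, log_text)]
    # zip drops a trailing start that has no matching end
    return list(zip(starts, ends))


def keep_segments(silences, duration):
    """Invert the silent spans into the spans to keep (the cut list)."""
    segments = []
    cursor = 0.0
    for start, end in silences:
        if start > cursor:
            segments.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < duration:
        segments.append((cursor, duration))
    return segments


silences = parse_silences(SAMPLE_LOG)
print(keep_segments(silences, duration=15.0))
# [(0.0, 4.0), (6.5, 10.25), (12.0, 15.0)]
```

A list of keep segments like this could then be written out in whatever timeline format an editor accepts. That export step is left out here.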
___________________________________________________________________
(page generated 2022-06-16 23:00 UTC)