[HN Gopher] Show HN: Improved freemusicdemixer - AI music demixi...
___________________________________________________________________
Show HN: Improved freemusicdemixer - AI music demixing in the
browser
Hi HN, Last time I showed free-music-demixer, which people seemed
to enjoy. It was a static website with a Javascript + WASM module
to perform music demixing (or music source separation) using an AI
model UMX-L (Open-Unmix) running client-side in the browser. Since
then, I have overhauled the project and made several improvements:
- The demixing/separation quality is higher now, since I
  implemented the missing post-processing step
- Memory usage is lower now by performing a custom segmented
  inference with a streaming LSTM, which should allow larger tracks
  (or, dare I say, arbitrarily-large tracks)
- There is a batch upload feature now to demix an entire folder of
  songs (and provide zip files of the stems)
- There are now dev logs printed to the website to show the
  progress better
Author : sevagh
Score : 111 points
Date : 2023-09-14 11:57 UTC (11 hours ago)
(HTM) web link (freemusicdemixer.com)
(TXT) w3m dump (freemusicdemixer.com)
| dylan604 wrote:
| after finally getting some free time to play with this, i did.
| i've used other non-AI based programs to remove vocals. the
| sound is totally different and obviously beyond them because these
| are stems. There's the tell-tale sound that I've heard from all of
| these generative AI audio things that sounds like a very heavily
| compressed version of the original. To describe it visually, it
| looks just like taking an image and removing part of the image
| and then using a generative fill. Compare how the fill looks to
| the rest of the image, and that's what this sounds like. Similar,
| but just smeared and not clean. Maybe it's not noticeable to
| people that don't work with audio, but it is one of those things
| that once you're trained to hear it, you can never not hear it.
|
| it is probably the most useful application of the AI things I
| have seen AND it does as advertised on the box. nice work on the
| project.
| sevagh wrote:
| Thanks for the kind words!
| fragmede wrote:
| Labeling the 4 parts Vocal | Melody | Bass | Drums, instead of
| Drums | Vocals | Bass | Other would go a long way to making it
| seem less programmer-y.
| sevagh wrote:
| Nice idea! The name 'other' comes from the traditional datasets
| used to train these models, but I'm open to trying to
| incorporate a better name.
| catapart wrote:
| Man, this is fucking awesome. Thank you for building this!
|
| Not just free, but local, without installation, and generalized
| to common use-cases too? This is a standard of development that I
| aspire to, so thank you for being a great example!
|
| Dev appreciation aside, I also record music with my long-distance
| buddy, so we often find ourselves trying to use midi recreations
| in order to get at least a passable version of timing and range
| that we can both practice from until the demo compositions have
| been laid down. It's often pretty far off the mark from the
| original track, so we will make fantastic use of this utility.
| Again, thanks so much!
| phpnode wrote:
| Just tried this and it seems to get stuck at the following step
| (for 20 mins so far)
|
| [WASM/C++ 18:07:24] Getting waveforms from istft
| [WASM/C++ 18:07:25] Copying waveforms
| sevagh wrote:
| Hmm - that should be the end of it, at which point download
| links will appear to either the Single or Batch apps, depending
| on which one you used:
|
| - bass, drums, vocals, other, karaoke.wav in the Single track
| window (at the bottom: Demixing outputs...)
|
| - song_1.zip, song_2.zip, ... in the Batch window (at the
| bottom: Batch outputs...)
|
| Like so:
| ```
| Demixing outputs...
| bass.wav drums.wav other.wav vocals.wav karaoke.wav
| ```
|
| There should also be a Javascript message on the left like
| "Preparing zip" or "Preparing stems for download"
| phpnode wrote:
| ah out of memory, I'll try with a shorter track.
| sevagh wrote:
| Oh no! Can you open a github issue with the size of track
| you tested? Or email the track to me at
| sevagh@protonmail.com
|
| I can make the segment size smaller (right now I'm using
| 60s segments).
| Spiwux wrote:
| Is this the best possible demixing you can get? Or did you have
| to use a smaller / lower quality model to make it run in
| browsers?
| sevagh wrote:
| In my first post, quite a lot of alternatives were discussed:
| https://news.ycombinator.com/item?id=36707877
|
| The model I'm using is called Open-Unmix
| (https://github.com/sigsep/open-unmix-pytorch). In 2021, there
| was an update to Open-Unmix to include new weights, UMX-L,
| which made it perform better than it used to on the older
| weights (UMXHQ).
|
| In the grand landscape of music demixing, I don't think UMX-L
| is near the top anymore.
|
| _However_, the demixing performance of freemusicdemixer.com is
| very close to the full PyTorch performance of Open-Unmix UMX-L,
| despite the tricks I needed to get it working in the browser,
| such as splitting up the inference to operate on segments of
| the song, or making the LSTM operate on streaming segments
| rather than holding the entire track in the LSTM memory.
|
| In my first release, I loaded and did inference on the entire
| track at once (like the PyTorch model), which frequently
| crashed or exceeded the 4GB WASM memory for medium or large-
| size tracks.
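
The segmented, streaming approach described above can be illustrated with a toy sketch. This is not the site's actual C++/WASM code: an exponential moving average stands in for the LSTM, and the function names, the `alpha` parameter, and the 44.1 kHz sample rate are assumptions made for illustration. The point it shows is that a recurrent model can be fed one segment at a time and still match whole-track output, as long as the recurrent state is carried from one segment to the next.

```python
from typing import List, Tuple

SAMPLE_RATE = 44_100   # assumed sample rate in Hz; not stated in the thread
SEGMENT_SECONDS = 60   # segment length mentioned in the thread


def process_segment(segment: List[float], state: float,
                    alpha: float = 0.5) -> Tuple[List[float], float]:
    """Run a toy recurrence over one segment; return outputs and final state.

    The exponential moving average is a stand-in for an LSTM: each output
    depends on the current input and on state carried over from earlier inputs.
    """
    outputs = []
    for x in segment:
        state = alpha * x + (1.0 - alpha) * state
        outputs.append(state)
    return outputs, state


def run_whole_track(samples: List[float]) -> List[float]:
    """Process the entire track in one call, starting from a zero state."""
    outputs, _ = process_segment(samples, state=0.0)
    return outputs


def run_segmented(samples: List[float], segment_len: int) -> List[float]:
    """Process the track one segment at a time, carrying state across segments."""
    outputs: List[float] = []
    state = 0.0
    for start in range(0, len(samples), segment_len):
        segment = samples[start:start + segment_len]
        segment_out, state = process_segment(segment, state)
        outputs.extend(segment_out)
    return outputs


if __name__ == "__main__":
    track = [float(i % 7) for i in range(25)]
    # Carrying the state forward makes segmented output match whole-track output.
    assert run_segmented(track, segment_len=4) == run_whole_track(track)
    print("samples per channel in one segment:", SEGMENT_SECONDS * SAMPLE_RATE)
```

In a real model the saving comes from only needing the intermediate activations for one segment at a time; resetting the state at each segment boundary instead would change the output near the boundaries.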
| fdgjgbdfhgb wrote:
| It's super exciting to see people using open-unmix like this!
| I worked with the creator of the model to try to do the same
| as a university project! Our solution was... not great, to
| say the least, but I'm happy someone else managed to do it!
| sevagh wrote:
| That's awesome! I first got in contact with Fabian-Robert
| (https://github.com/faroit) during the Music Demixing
| Challenge 2021, where the UMX-L weights were first
| unveiled.
|
| We have since discussed my projects a couple of times! I
| even got the idea for a streaming LSTM from him.
|
| I think music demixing in general owes a lot of thanks to
| Open-Unmix and co (https://github.com/sigsep), who have
| relentlessly been publishing open-source models and related
| code (source separation metrics, dataset loaders, etc.) for
| years, and who blew the industry open with their MDX 21 [1]
| and SDX 23 [2] AI challenges.
|
| [1]: https://www.aicrowd.com/challenges/music-demixing-
| challenge-...
|
| [2]: https://www.aicrowd.com/challenges/sound-demixing-
| challenge-...
| alok-g wrote:
| For those interested, Facebook's Demucs page
| (https://github.com/facebookresearch/demucs) gives performance
| comparison for several models including open-unmix.
|
| See also: https://www.stemroller.com This runs as a local app on
| Windows and Mac.
| sevagh wrote:
| My dream (and next major project) is to get Demucs (v4, Hybrid
| Transformer) running in WASM, similar to what I've done with
| Eigen/C++ and Open-Unmix.
| rrherr wrote:
| Oh yes, that would be amazing! Good luck!
| input_sh wrote:
| By my own little comparison in which I've given the same song
| to both, Demucs appears much better (while requiring an app and
| consuming far more CPU).
|
| Thanks for sharing! I had no idea this was a thing.
| [deleted]
| adzm wrote:
| demucs works far better than anything else I've used,
| especially with more esoteric kinds of music in my
| experience. plus you can run it with GPU support as well!
| sevagh wrote:
| Demucs is awesome no doubt.
|
| >plus you can run it with GPU support as well!
|
| Open-Unmix also runs on the GPU in its original form, as it was
| intended to, since it uses PyTorch just like Demucs.
|
| I'm curious about using WASM + WebGPU to add GPU
| acceleration to this project, though.
| qingcharles wrote:
| This is great, thank you. I've been using another site to do
| this, but it requires upload to the cloud.
|
| I use it for removing background music from movie clips so I can
| remix them and add alternate background.
| dylan604 wrote:
| "Runs locally in your browser!
|
| Unlike similar products, it's free to use and doesn't store your
| data. All processing is done in your browser, and your files are
| never uploaded anywhere. It runs well on computers and very
| slowly on smartphones; user beware."
|
| I'm impressed by the fact that this was the choice made. I can
| only imagine it also helps keep the operational costs down, as
| well as liability for copyright and whatnot, since they never
| come into possession of the content. However, it also means they
| "lose out" on a possible continuous source of training data, which
| other less ethical evilCorp-type companies would not pass up.
| sevagh wrote:
| I'll be honest with you - I wouldn't even know what to do with
| user data if I collected it. I'm not great at training neural
| networks or understanding how to augment training with
| additional data.
|
| Basically, the _only_ special thing about freemusicdemixer is
| that it runs client-side, because I'm better at writing C++ for
| a pre-trained network than I am at training new neural
| networks. However, it's a cool advantage so I'll keep promoting
| it as the distinguisher of my "product" (since I didn't create
| or train the underlying neural network, I just consider it a
| clever WASM frontend).
|
| The operational costs are precisely $0 since it's a static
| website (aside from the domain registry).
| tkp wrote:
| It is very nice to be able to compute this in our browsers
| without limits or registration. The results are not always
| perfect but already allow for a good time testing remixes
| or doing karaoke with friends! Thanks for sharing your work.
| sevagh wrote:
| I envisioned my site would be useful for quick and dirty
| prototyping - especially since the outputs are not licensed
| for commercial use.
|
| Then, when one's project idea is validated, they could
| use a paid stem separation service with higher quality and
| commercial licensing.
|
| That's why I added some sentences on the website to solicit
| partnerships for targeted advertisements, to see if any pro
| demixing companies were interested.
| dylan604 wrote:
| That's an interesting idea. Scratch track stems while
| we're waiting for the approvals and delivery of final
| assets.
| gabereiser wrote:
| Keep that mindset. Please. User data shouldn't _be_ your
| product. This is perfectly acceptable and thank you for doing
| this.
| lh15 wrote:
| Are there any tools that will add musical accompaniment to a
| vocal track? Basic strings/piano?
| nre wrote:
| While this might seem to be intended for DJ remixes a demixer
| also lets you jam along to your favorite songs. For instance you
| can extract only the voice, bass, and drums and play along with
| your guitar like you were a part of the band.
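
Building a play-along track from selected stems amounts to summing them sample by sample. The sketch below is a minimal illustration, assuming each stem is a mono list of float samples in [-1.0, 1.0] and all stems have equal length; `mix_stems` is a hypothetical helper, not something the site provides. The stem names follow the ones used in the thread.

```python
from typing import Dict, Iterable, List


def mix_stems(stems: Dict[str, List[float]], keep: Iterable[str]) -> List[float]:
    """Sum the selected stems sample by sample and clip to [-1.0, 1.0]."""
    selected = [stems[name] for name in keep]
    if not selected:
        return []
    length = len(selected[0])
    if any(len(s) != length for s in selected):
        raise ValueError("all stems must have the same number of samples")
    # zip(*selected) yields one tuple of aligned samples per time step.
    return [max(-1.0, min(1.0, sum(samples))) for samples in zip(*selected)]


if __name__ == "__main__":
    # Tiny made-up stems, two samples each, purely for illustration.
    stems = {
        "vocals": [0.5, -0.25],
        "bass": [0.25, 0.25],
        "drums": [0.5, -0.5],
        "other": [0.125, 0.125],
    }
    # Leave out "other" (where the guitar usually ends up) to play along.
    backing = mix_stems(stems, ["vocals", "bass", "drums"])
    print(backing)
```

Hard clipping is the simplest way to keep the sum in range; scaling the mix down before summing would avoid distortion at the cost of overall level.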
| npsimons wrote:
| > While this might seem to be intended for DJ remixes a demixer
| also lets you jam along to your favorite songs.
|
| Not to mention arranging or just notating songs.
| sevagh wrote:
| Maybe I can ask HN out loud how I can start figuring out how to
| get this website on Google search results -
|
| e.g. I want it to show up when I search relevant terms like "free
| stem separation" "free music separation" "free music isolation"
| "music stem separation" "music source separation"
| winternett wrote:
| Add meta keyword and description tags and detailed text (HTML
| tags) to each page you want indexed, and post your app page URLs
| across social media sites. Then give it a little time: indexing,
| especially among lots of competition, won't happen
| instantly. :/
___________________________________________________________________
(page generated 2023-09-14 23:01 UTC)