[HN Gopher] A library for audio feature extraction, regression, ...
       ___________________________________________________________________
        
       A library for audio feature extraction, regression, classification,
       segmentation
        
       Author : nothrowaways
       Score  : 90 points
       Date   : 2021-12-09 10:38 UTC (2 days ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | jaflo wrote:
       | I used this for a personal project [1] a couple of years ago to
       | shorten audio files without making them sound like they were cut.
       | This is done by removing repeated sections, e.g. replacing two
       | choruses with one. I initially wanted this to "resize" background
       | music to match video footage I had, but it is kind of fun to just
       | mess around with songs too (like those content aware scale
       | picture memes, but to create the shortest possible audio).
       | 
       | I think for my use case specifically, the library was kind of
       | overkill though and something like librosa [2] would have been
       | enough for feature extraction.
       | 
       | 1: https://projects.loud.red/snipsnip/
       | 
       | 2: https://librosa.org/doc/latest/index.html
        
       | terhechte wrote:
       | Interesting, I've recently done a bit of searching in this space
       | to find a project that would fit for an idea I had: I'd like to
       | use a raspberry pi zero w to listen for our doorbell. If the
       | doorbell rings, it should do something (e.g. send an sms or turn
       | on a light).
       | 
       | I couldn't really find anything, does someone know if a project
       | like this exists? For the one listed here, I'm not sure if it is
       | fast enough to run on a slow device like the W? Also, would it be
       | able to detect audio in a continuous stream from say a
       | microphone?
        
         | achn wrote:
         | Or, you know, just wire in the doorbell button and be done?
        
         | foo_barrio wrote:
         | If your doorbell is electric and plays a recording of chimes,
         | it can be very straightforward to implement this yourself. Just
         | off the top of my head, I'd take FFTs (fast Fourier transforms)
         | of a known recording of the doorbell, limited to certain
         | frequencies and normalized, and compare them to the audio stream.
         | This can be done in real time without any hardware
         | acceleration. You can also go a bit further and implement
         | something similar to the shazam algo.
         | 
         | If it's an "analog" door or a buzzer it will be trickier.
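
A minimal sketch of the comparison foo_barrio describes, not a tested doorbell detector. It takes the FFT of one frame, keeps only a frequency band, normalizes the magnitudes, and scores the frame against a template built the same way from a known recording. The 8000 Hz sample rate, the 500-2000 Hz band, the 0.8 threshold, and every function name here are assumptions made for illustration. The FFT is written in pure Python to keep the example self-contained; on a real device a library FFT would likely be much faster.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT. len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def band_spectrum(frame, sample_rate, lo_hz, hi_hz):
    """Unit-length magnitude spectrum of `frame`, limited to [lo_hz, hi_hz]."""
    n = len(frame)
    # Hann window to reduce spectral leakage at the frame edges.
    windowed = [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(frame)
    ]
    spec = fft(windowed)
    lo = int(lo_hz * n / sample_rate)
    hi = int(hi_hz * n / sample_rate)
    mags = [abs(c) for c in spec[lo:hi + 1]]
    # Normalize so overall loudness does not affect the score.
    norm = math.sqrt(sum(m * m for m in mags)) or 1.0
    return [m / norm for m in mags]


def similarity(a, b):
    """Cosine similarity of two unit-length spectra (dot product)."""
    return sum(x * y for x, y in zip(a, b))


def matches(frame, template, sample_rate=8000, lo_hz=500, hi_hz=2000,
            threshold=0.8):
    """True if the frame's band spectrum is close to the template's."""
    spec = band_spectrum(frame, sample_rate, lo_hz, hi_hz)
    return similarity(spec, template) >= threshold


def sine_frame(freq_hz, n=1024, sample_rate=8000, amp=1.0, phase=0.0):
    """Synthetic test tone standing in for a recorded chime."""
    return [
        amp * math.sin(2 * math.pi * freq_hz * i / sample_rate + phase)
        for i in range(n)
    ]


# Build the template once from the "known recording" (here a 1 kHz tone),
# then score incoming frames from the audio stream against it.
template = band_spectrum(sine_frame(1000.0), 8000, 500, 2000)
quieter_chime = sine_frame(1000.0, amp=0.3, phase=1.0)
print(matches(quieter_chime, template))
```

A real chime changes over time, so one template frame is probably not enough; comparing a short sequence of frames would be closer to the Shazam-style approach mentioned in the comment.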
        
         | rvense wrote:
         | You're looking for a single-bit stream of information and very
         | likely you can find it as an electrical signal inside your
         | doorbell already.
         | 
         | I wanted to replace the sound my wireless doorbell made so I
         | took the basestation apart and it was a very simple thing, with
         | three chips: a radio (NRF51), a microcontroller (PIC) and a
         | blob of epoxy on a separate board that was connected to the
         | speaker. It took maybe half an hour of beeping and scoping to
         | understand how the PIC and the sound maker communicated - in
         | this case five pins to select one of 32 sounds, and one pin to
         | trigger playback. I simply took the playback trigger pin and
         | connected it to a small MP3 player module and moved the speaker
         | from the internal sound maker to that.
         | 
         | If you can just attach wires to the button directly, it's even
         | simpler.
         | 
         | Of course, if the object is to use a pi zero to do some DSP,
         | this is missing the point. But there's a good chance it's the
         | long way round if you want to solve the problem of knowing when
         | somebody is at your door.
        
         | garblegarble wrote:
         | Same here, what I want to do is detect my dog barking
         | excessively at people/cats/birds on the street and trigger my
         | curtains to close for a few minutes... I've already got a wired
         | camera in there so processing the audio seems easiest
         | technically, but I can't help but think it's a really crazy
         | waste of CPU time (even though it will be good for my
         | neighbours).
         | 
         | I'd wondered if computing peak volumes per second would be a
         | good enough proxy, then trigger action if the threshold is
         | exceeded more than n times in 15 seconds... certainly seems
         | like it should be way less compute intensive!
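
A rough sketch of the peak-volume idea in garblegarble's comment: compute one peak per second of audio and trigger when the threshold was exceeded more than n times within the last 15 seconds. The class name, the threshold value, and the assumption that samples arrive as floats in one-second chunks are all hypothetical.

```python
from collections import deque


class LoudnessTrigger:
    """Fires when more than `max_loud_seconds` of the last
    `window_seconds` seconds had a peak above `threshold`."""

    def __init__(self, threshold, max_loud_seconds, window_seconds=15):
        self.threshold = threshold
        self.max_loud_seconds = max_loud_seconds
        # One boolean per second; old entries fall off automatically.
        self.recent = deque(maxlen=window_seconds)

    def feed_second(self, samples):
        """Consume one second of samples; return True if we should act."""
        peak = max((abs(s) for s in samples), default=0.0)
        self.recent.append(peak > self.threshold)
        return sum(self.recent) > self.max_loud_seconds


# Example: act once more than 3 of the last 15 seconds were loud.
trigger = LoudnessTrigger(threshold=0.5, max_loud_seconds=3)
loud = [0.0, 0.9, -0.2]
quiet = [0.0, 0.1, -0.05]
results = [trigger.feed_second(chunk) for chunk in [loud, loud, quiet, loud, loud]]
print(results)
```

Each second costs one pass over the samples plus a sum over at most 15 booleans, which is far cheaper than a per-frame FFT, though a plain volume peak cannot tell barking from any other loud noise.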
        
       | beepbooptheory wrote:
       | I was in the market for one of these and ended up with yaafe [1],
       | which is a little older, but has, IMO, a better API, more
       | flexible output, and C as well as Python bindings.
       | 
       | Also, the documentation is rather good, with links to the various
       | papers for each algorithm. The above library, in contrast, is
       | a little impenetrable to me.
       | 
       | I'm using this with postgres and supercollider for more of an
       | artistic project though, so YMMV.
       | 
       | 1. https://github.com/Yaafe/Yaafe
        
         | Jugurtha wrote:
         | > _I'm using this with postgres and supercollider for more of
         | an artistic project though, so YMMV._
         | 
         | Do you mind telling us more about this project?
        
       ___________________________________________________________________
       (page generated 2021-12-11 23:00 UTC)