[HN Gopher] Lessons learnt building a real-time audio applicatio...
       ___________________________________________________________________
        
       Lessons learnt building a real-time audio application in Python
        
       Author : spmvg
       Score  : 44 points
       Date   : 2024-09-09 14:18 UTC (8 hours ago)
        
 (HTM) web link (www.vangemert.dev)
 (TXT) w3m dump (www.vangemert.dev)
        
       | camtarn wrote:
       | "It turns out that the round-trip time from an audio interface,
       | through a computer (DAW) and back to the speakers takes a few
       | hundreds of milliseconds, making direct audio processing
       | impossible using consumer hardware." - uh, what? Real-time audio
       | processing has been a thing for at least a couple of decades. It
       | doesn't work by default on Windows, but you can get free drivers
       | (ASIO4All) which make it work on pretty much any hardware. And it
       | works out of the box on Macs.
       | 
       | "Latency seems to shift by a few tens of milliseconds when
       | restarting the application." - this makes me think you are using
       | the wrong API for your sound input/output. With modern realtime
       | audio support, your total latency from input to output should be
       | less than 10ms _total_.
       | 
       | "I expected that memory usage would get out of hand quite fast
       | due to the ever growing dictionary of arrays containing audio
       | data, but this does not happen in practice. I suspect that the
       | good performance is caused by highly optimized memory management
       | of Python and modern OSes." - without concrete figures it's quite
       | hard to evaluate this, but what did you expect to happen? With a
       | 44.1 kHz stereo audio stream, you should be storing 88.2 thousand
       | samples a second. Say you're using 64-bit floats, as a worst
       | case. Your audio storage should be growing at about 689 KiB/sec,
       | plus a bit extra for object overhead. How much is it actually
       | growing by? Of course Python is probably doing a bunch of
       | allocation and deallocation for temporary objects behind the
       | scenes, but hopefully you should not need to lean too hard on
       | 'highly optimized memory management' - ideally, you should hardly
       | be allocating anything at all. Also, why a dict, rather than just
       | a large array that you can occasionally make bigger?
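       | 
       | As a rough check of that estimate, here is a minimal Python
       | sketch of the arithmetic. The variable names are illustrative,
       | and 8 bytes per sample is the 64-bit worst case assumed above:

```python
# Back-of-the-envelope storage growth for uncompressed audio.
# All values are the assumptions stated in the comment above.
SAMPLE_RATE_HZ = 44_100   # frames per second
CHANNELS = 2              # stereo
BYTES_PER_SAMPLE = 8      # 64-bit float, the worst case

samples_per_second = SAMPLE_RATE_HZ * CHANNELS
bytes_per_second = samples_per_second * BYTES_PER_SAMPLE
kib_per_second = bytes_per_second / 1024
mib_per_minute = bytes_per_second * 60 / (1024 * 1024)

print(f"{samples_per_second} samples/s")
print(f"{kib_per_second:.0f} KiB/s, {mib_per_minute:.1f} MiB/min")
```

       | This covers raw sample data only; per-object overhead of
       | whatever container holds the chunks comes on top of it.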
       | 
       | Finally ... I'm sure you already know that Python is possibly the
       | worst mainstream language you could pick for realtime audio
       | processing. But that is fine. I have tried to build audio stuff
       | in Python too! Sometimes using the wrong tool for the job is part
       | of the fun.
        
         | PhunkyPhil wrote:
         | +1. In Ableton on Windows you can get your latency down to
         | ~40ms without a dedicated sound card using ASIO. Mac's drivers
         | are even better with sub ~20 ms on my m2 pro IIRC.
        
           | shannonclaude wrote:
           | +1 to the comments here. Part of the issue here is running
           | these applications in Python. It's not really optimized to
           | handle these loads and do DSP-based compute efficiently.
        
           | jancsika wrote:
           | > Mac's drivers are even better with sub ~20 ms on my m2 pro
           | IIRC.
           | 
           | Just to be clear that you're measuring apples to apples with
           | OP:
           | 
           | You are measuring less than 40ms _roundtrip latency_ on your
           | Mac. Is this correct?
        
         | spmvg wrote:
         | Interesting comment! I'm going to figure out if using another
         | driver allows me to get under 20 ms in latency. Right now I'm
         | measuring around 300 ms in latency round-trip, which is not a
         | problem because I can correct for it. (I'm using a Focusrite
         | Scarlett 2i2 with default drivers.)
         | 
         | The reasoning behind my comment about round-trip time was as
         | follows:
         | 
         | - Right now I'm measuring around 300 ms round-trip time,
         |   without processing in between.
         | - In the past I've tried to do live effects in Ableton with
         |   ASIO drivers (guitar in -> Ableton effects -> out), and the
         |   delay was too noticeable. I couldn't play that way without
         |   making my ears bleed and I've switched back to pedals since.
         | 
         | One follow up: how could I achieve a total round-trip latency
         | of around 10 ms total, as you describe? If I use a buffer of
         | 500 samples @ 44.1 kHz, then I am spending already 11 ms just
         | filling the buffer. So then the buffers need to become really
         | small, causing more processing overhead, right? Not sure if
         | this is the way to go.
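         | 
         | The trade-off in that question can be made concrete. A minimal
         | sketch (the helper name buffer_duration_ms and the list of
         | buffer sizes are made up for illustration) of how long one
         | buffer takes to fill and how often a callback would then fire:

```python
def buffer_duration_ms(frames: int, sample_rate_hz: float) -> float:
    """Time needed to fill one buffer of `frames` frames, in milliseconds."""
    return 1000.0 * frames / sample_rate_hz


SAMPLE_RATE_HZ = 44_100

for frames in (64, 128, 256, 500, 1024):
    ms = buffer_duration_ms(frames, SAMPLE_RATE_HZ)
    callbacks_per_second = SAMPLE_RATE_HZ / frames
    print(f"{frames:>5} frames: {ms:6.2f} ms per buffer, "
          f"{callbacks_per_second:7.1f} callbacks/s")
```

         | This only counts the time to fill a single buffer. Driver,
         | converter and output buffering add to the round trip, so the
         | figures are a lower bound rather than a measured latency.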
        
           | camtarn wrote:
           | Yeah, your Scarlett should be capable of single-digit ms
           | latency. If you're on Windows, you need to install its ASIO
           | drivers and figure out how to use them from Python. Then,
           | yes, use tiny buffers and run your audio processing very fast
           | - which is where Python's slowness will probably become a
           | real problem.
           | 
           | 10ms latency is how long sound takes to travel 3-and-a-bit
           | metres. So if your amp is a few metres from you, you would
           | experience that delay between hitting the guitar strings and
           | hearing the amplified sound. This should barely be
           | noticeable. If you were noticing a delay greater than that in
           | your Ableton effects setup, your settings needed tweaked. All
           | of this is completely possible - I had a PC-based electronic
           | drum setup in 2006, running through the Reason DAW, which had
           | 8ms latency between hitting a pad and hearing the result.
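           | 
           | The distance comparison is easy to reproduce. A small sketch,
           | assuming a speed of sound of 343 m/s (dry air at roughly
           | 20 degrees C; the constant and function name are my own):

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at roughly 20 degrees C


def equivalent_distance_m(latency_ms: float) -> float:
    """Distance sound travels through air during `latency_ms`."""
    return SPEED_OF_SOUND_M_PER_S * latency_ms / 1000.0


# Latency figures mentioned so far in the thread: 8, 10 and 300 ms.
for latency_ms in (8, 10, 300):
    print(f"{latency_ms:>3} ms ~ {equivalent_distance_m(latency_ms):6.2f} m")
```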
           | 
           | Hmm, I wonder if Cython (static Python-to-C compiler) would
           | make writing audio code easier/more possible?
        
             | spmvg wrote:
             | With Ableton and the default ASIO configuration on my
             | Scarlett I get 96 ms combined input+output latency without
             | any processing in between, so that's probably what made my
             | ears bleed before. Tweaking the sample rate and buffer size
             | does indeed get me single-digit latencies in Ableton. So I'm
             | definitely going to adjust the section about latency,
             | thanks for this!
             | 
             | I'm a bit on the fence about what this means for the
             | difficult latency calibration routine in the application.
             | Ideally I could throw the calibration routine away, but
             | then I require that users have ASIO installed, while the
             | app now also works with non-ASIO drivers. And indeed Python
             | itself might become a bottleneck (making this work in
             | Python has been half the fun).
        
               | sim7c00 wrote:
               | try a clarett interface. it also comes with preamps which
               | will make your sound less noisy, scarlett preamps are
               | just absolutely terrible. you can debug your daw to see
               | how it uses drivers and make a python module which
               | exposes similar functions to python. you will likely
               | still want a delay compensation to make things seem free
               | of any latency, but it will be doing _much_ less
               | compensating. maybe there's an open-source daw if you want
               | to skip reversing driver calls from a debugger.
        
               | spmvg wrote:
               | Debugging an existing DAW to see how they do it under the
               | hood is an interesting idea. Haven't done that yet.
               | 
               | About another interface: I do want to keep the
               | application supporting cheaper interfaces such as the
               | Scarlett, because the target audience (hobby musicians)
               | will be using those. Still would be a nice upgrade for
               | me!
        
               | kibibu wrote:
               | Can take a peek at how Tracktion engine does it too
        
               | bongodongobob wrote:
               | I would disable any services and programs running in the
               | background as well. Years ago I disabled the Windows
               | print spooler and it greatly improved jitter. Not sure if
               | that's still the case these days though, that was
               | probably 10 years ago.
        
               | spmvg wrote:
               | So far CPU usage hasn't been an issue at all (<1% usually
               | on my not-very-impressive laptop), which surprised me as
               | well
        
               | dist-epoch wrote:
               | Even without ASIO you should be able to hit 40 ms latency
               | on pretty much any Windows audio hardware, including
               | motherboard built-in.
               | 
               | If you get 300 ms you're doing something wrong. Note that
               | Windows has multiple audio APIs, 300 ms is about the
               | latency of the old MME api, you need to use the newer
               | one, WASAPI.
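               | 
               | One way to check which APIs are exposed is to list the
               | host APIs and pick by name. A hedged sketch: the helper
               | pick_host_api and its preference order are hypothetical,
               | and the input shape (dicts with a "name" key) is what the
               | third-party sounddevice package's query_hostapis() is
               | assumed to return:

```python
def pick_host_api(hostapis, preference=("ASIO", "WASAPI", "MME")):
    """Return (index, name) of the first preferred host API that is present.

    `hostapis` is a sequence of dicts with a "name" key, which is the shape
    that sounddevice.query_hostapis() is assumed to return.
    Returns None when none of the preferred APIs is found.
    """
    for wanted in preference:
        for index, api in enumerate(hostapis):
            if wanted.lower() in api["name"].lower():
                return index, api["name"]
    return None


# Hypothetical listing, standing in for a real Windows machine.
example = [
    {"name": "MME"},
    {"name": "Windows DirectSound"},
    {"name": "Windows WASAPI"},
]
print(pick_host_api(example))

# On a real system (requires the third-party sounddevice package):
#   import sounddevice as sd
#   print(pick_host_api(sd.query_hostapis()))
```

               | Matching on a substring is deliberate, since the exact
               | host API names can differ between builds and platforms.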
        
               | spmvg wrote:
               | I apparently only have the old Windows MME drivers indeed
               | (and ASIO, on Win10). Need to look into why I can't find
               | WASAPI and if I can assume other Windows users have those
               | by default.
        
               | michaelrmmiller wrote:
               | WASAPI has been available since Windows Vista. It isn't
               | its own set of drivers but rather a unifying layer for
               | the WDM driver and the preceding mishmash of Windows
               | audio APIs (MME, DirectSound, etc.). WASAPI supports low-ish
               | latencies with Exclusive Mode and then something like
               | 10ms buffering in Shared Mode through the Windows audio
               | server, I recall.
               | 
               | Put another way: any Windows audio device supports WASAPI
               | unless it only ships with an ASIO driver which is
               | unlikely, even in the pro audio space.
        
               | ubercore wrote:
               | I don't know windows audio, but on mac audio that's
               | wildly high latency for a scarlett interface.
        
       | nicholasjarnold wrote:
       | I was once bitten by not understanding that there is a difference
       | between "regular" clocks and high performance clocks/timers that
       | a developer can take advantage of. At the time I needed a
       | sampling routine to run at precisely once per second. My
       | inexperience led me to go with something like thread.sleep(1000),
       | and I learned quickly that I was mistaken in thinking it'd run
       | with little jitter. As others are pointing out, there are also
       | similar lessons and solutions when dealing with audio processing
       | pipelines.
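       | 
       | For illustration, a minimal Python sketch of the usual fix:
       | sleep until an absolute deadline taken from a high-resolution
       | clock instead of sleeping a fixed amount each time. The name
       | run_periodic and its parameters are made up for this example:

```python
import time


def run_periodic(task, period_s, ticks,
                 clock=time.perf_counter, sleep=time.sleep):
    """Call `task` `ticks` times, aiming for start + k * period_s.

    Sleeping until an absolute deadline avoids the drift that accumulates
    when each iteration simply sleeps for a fixed period_s.
    Returns how late (in seconds) each tick woke up.
    """
    start = clock()
    lateness = []
    for k in range(ticks):
        deadline = start + k * period_s
        remaining = deadline - clock()
        if remaining > 0:
            sleep(remaining)
        lateness.append(clock() - deadline)  # wake-up error for this tick
        task()
    return lateness


late = run_periodic(lambda: None, period_s=0.01, ticks=20)
print(f"worst wake-up lateness: {max(late) * 1000:.3f} ms")
```

       | This removes accumulated drift, not per-tick jitter: each
       | wake-up can still be late by whatever the OS scheduler adds.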
        
         | spmvg wrote:
         | Indeed, it is not a guarantee that the "sleep" will be exactly
         | that long. In the code I'm not "sleeping" in any sensitive
         | places, instead I'm relying on the callback to the audio stream
         | object, which just needs to finish before the next one starts
         | (less of a timing constraint).
        
       ___________________________________________________________________
       (page generated 2024-09-09 23:01 UTC)