[HN Gopher] The Linux audio stack demystified
       ___________________________________________________________________
        
       The Linux audio stack demystified
        
       Author : ruffyx64
       Score  : 93 points
       Date   : 2024-07-22 14:58 UTC (4 days ago)
        
 (HTM) web link (blog.rtrace.io)
 (TXT) w3m dump (blog.rtrace.io)
        
       | ruffyx64 wrote:
       | Wrote this blog article as I needed to get a better understanding
       | of the audio stack on Linux (esp. PipeWire, PulseAudio, ALSA,
       | etc. ...). The article turned out to be a lengthy in-depth
       | explanation of how audio works, how digital audio works, and what
       | sound servers on Linux actually do. Tried to write it in a way so
       | it is accessible and understandable for beginners but also
       | enlightening for experienced users. Hope it's helpful to HN.
        
         | brudgers wrote:
         | My experience:
         | 
         | I'm interested in how Linux Audio works. The first half of the
         | article covers other topics. It could be a separate article. An
         | article focused on Linux Audio could say "For audio basics,
         | click this link to my article on Audio Basics."
         | 
         | Even for beginners, that's useful because even beginners just
         | want to get sound out of their speakers and anatomy and physics
         | lessons are in the way. It's ok to start with ALSA. There's no
         | need to boil the ocean.
        
         | chung8123 wrote:
         | I really appreciate blogs/articles like this. It really helps
         | me get beyond the surface on things and I always learn
         | something. Thanks for taking the time to share.
        
         | Modified3019 wrote:
         | Learning by trying to teach is probably the best way to clarify
         | and crystallize what we think we know. Always appreciate these
         | kinds of posts, especially since they tend to shine a light on
         | all the contextual bullshit that experts take for granted.
         | 
         | Right now I'm doing the same for an identification/contextual
         | guide of local weeds and insects for seasonal scouts (I'm an
         | agronomist). Unfortunately I find complexity tends to quickly
         | become fractal and highly interlinked and it's hard to set an
         | entry point or tell when to limit scope.
         | 
         | I think you've done a great job of doing just that.
        
       | anvuong wrote:
       | Thanks for the nice writing. But do you have any insight on why
       | Bluetooth audio is so clunky on Linux? I'm using a pair of Sony
       | XM4 and I have never had any problems on my 4 Windows machines.
       | But on Ubuntu (both 22.04 and 24.04), I have had to jump through
       | many hoops, from editing a bunch of config files and changing
       | kernel flags to disabling and enabling a bunch of things I don't
       | understand (mostly from reading the Arch Wiki), just to get it
       | working _some_ of the time. Some days it will just outright
       | refuse to connect, sometimes it connects but doesn't play
       | anything (switching the audio device to it generates some
       | undecipherable error logs), and (probably worst) sometimes it
       | connects very quickly but stays locked in low-fidelity mode
       | instead of the A2DP sink. I'm so fed up that I just switch to
       | wired headphones every time I use Ubuntu.
        
         | cogman10 wrote:
         | It's so clunky, IMO, because bluetooth is a dumbass protocol
         | with things in the standard that should not be there (including
         | which audio codecs are supported with which levels of
         | bluetooth). Rather than just being a more simple network of
         | wireless devices, it's a very complex protocol which makes
         | everything more complicated.
         | 
         | The reason you struggled could be anything from the firmware
         | blob for your bluetooth device, to the kernel driver installed,
         | to bluez, to the sound server you are using. Any one of those
         | things messing up will lead to a bad experience.
         | 
         | I've had a relatively good experience with kde-plasma's
         | bluetooth management stuff. But I still have to do dumb things
         | like manually selecting which audio codec to use when I go on a
         | call.
         | 
         | How could bluetooth be better? It should be at least 2
         | standards. 1 defining the wireless data transfer and network
         | capabilities, a second which defines how a computer negotiates
         | with a device to send audio. It shouldn't be 2 standards merged
         | together like it currently is. Wifi Direct is more what
         | bluetooth should be.
        
         | disinterred wrote:
         | I use Arch Linux and have never had an issue with pairing
         | bluetooth with anything. In fact, imho, it works much smoother
         | than Windows because I keybind bluetoothctl to connect to any
         | bluetooth headphones, speakers, keyboard or whatever
         | automatically using their bluetooth device IDs. To do this you
         | must first pair them (I use the blueman-manager gui) and then
         | get their bluetooth device ids and keybind the bluetoothctl
         | command. All of this is easy to do by asking ChatGPT. Hope this
         | helps.
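
A minimal sketch of what such a keybinding might run, assuming the device has already been paired. `bluetoothctl connect <address>` is the real subcommand; the address below is a placeholder, and the helper names `connect_command` and `connect` are made up for illustration.

```python
import re
import subprocess

# Hypothetical placeholder; substitute the address of a paired device
# (for example one listed by `bluetoothctl devices`).
HEADPHONES = "AA:BB:CC:DD:EE:FF"

# Six colon-separated hex byte pairs, e.g. AA:BB:CC:DD:EE:FF.
MAC_RE = re.compile(r"^(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$")


def connect_command(address: str) -> list[str]:
    """Build the argv that asks bluetoothctl to connect to a paired device."""
    if not MAC_RE.match(address):
        raise ValueError(f"not a Bluetooth device address: {address!r}")
    return ["bluetoothctl", "connect", address]


def connect(address: str) -> int:
    """Run the command and return bluetoothctl's exit status."""
    return subprocess.run(connect_command(address)).returncode
```

A desktop hotkey would then be bound to a small script that calls `connect(HEADPHONES)`; how hotkeys are defined depends on the desktop environment or window manager.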
        
           | ssl-3 wrote:
           | I've never done much with Bluetooth under desktop Linux, but
           | that sounds like a woeful pain in the ass compared to the
           | usual steps for Android or Windows:
           | 
           | 1. Pair headphones in a couple of clicks/taps; sound comes
           | out.
        
             | jauntywundrkind wrote:
             | You can just pair as usual, yes, like any other OS, via a
             | similar gui. And the device will then reconnect in the
             | future.
             | 
             | What the parent is describing is an advanced flow, that can
             | be helpful if you have lots of computers & need to juggle
             | bt devices.
             | 
             | Setting up a hotkey just takes _pre-work_ to set up. _This
             | workflow is optional._ But it saves time & effort _if_ for
             | some reason you are one of the very few users who moves
             | devices around a lot.
        
               | Izkata wrote:
               | A hotkey is more work than GP is describing. Pairing is a
               | one-time thing, after that they connect automatically
               | when the headphones are on and nearby.
               | 
               | ...which, also, is exactly what mine do with Ubuntu. I
               | used bluetoothctl to pair them once when I first got
               | them, and when I turn them on Ubuntu automatically
               | connects and switches the audio over. I don't have the
               | same model headphones as GGGP, so I'm guessing it's a
               | problem specifically with that model's implementation
               | (Edit: or from another person who has the same model and
               | no issues, perhaps some combination of hardware/software
               | specific to that user).
        
               | jauntywundrkind wrote:
               | I think we're actually somewhat in alignment, but when
               | you say
               | 
               | > _Pairing is a one-time thing,_
               | 
               | You ignore the two scenarios I face regularly, that stem
               | from me having lots of devices and lots of computers &
               | wanting to switch around what's paired to what.
               | 
               | We both seem to be trying to defeat the notion that using
               | Bluetooth in Linux is hard or special (it's not at all,
               | it works like anywhere else, and these reports of it
               | being hard are from people with _at best_ extremely small
               | domains of experience & knowledge).
               | 
               | I was trying to add that Linux has further upsides for
               | when you do want to go further, and highlight & interpret
               | the parent post to show how I have those issues &
               | describe how adding hotkeys (something only Linux does)
               | would help me, an advanced user juggling many systems &
               | devices. I've clarified my post to mention that
               | auto-reconnecting will just work in most scenarios (but I
               | get why some folks might think it's cool to have hotkeys).
        
             | LtWorf wrote:
             | Yes the couple of clicks is the pairing. You have to pair.
        
               | ssl-3 wrote:
               | Then this keybinding and device ID management business
               | accomplishes what, exactly, other than exercising extra
               | steps?
        
         | jpeloquin wrote:
         | I also have XM4's and they worked fine on Arch after addressing
         | two problems:
         | 
         | Do you dual boot? Different OS's on the same computer will
         | generate different pairing keys even though they share the same
         | MAC, and this will cause connection issues. Usually that's
         | reported as having to re-pair every time you switch OS's
         | though.
         | 
         | https://unix.stackexchange.com/questions/255509/bluetooth-pa...
         | 
         | I've also experienced audio skipping & popping with a dual
         | WiFi/Bluetooth card, which went away when I disabled WiFi.
         | Apparently the Linux driver was faulty and allowed some
         | interference; the card worked fine on Windows.
        
         | pdw wrote:
         | Debian does not ship the AAC codec, due to the legal quagmire
         | surrounding the necessary code. The same probably goes for
         | Ubuntu. That might be the cause of at least some of your
         | problems. https://tookmund.com/2024/02/aac-and-debian
        
         | self_awareness wrote:
         | > I'm using a pair of Sony XM4 and I have never had any problems
         | > on my 4 Windows machines. But on Ubuntu (both 22.04 and 24.04),
         | > I have had to jump through many hoops [...]
         | 
         | I also have XM4's (best headphones in my life; seriously,
         | they've saved my sanity and lowered my stress levels, more than
         | a few times), but I never had any problems with BT pairing. I
         | use them with my phone, Ubuntu, OpenSUSE, ArchLinux and macOS,
         | although not Windows, and they always pair up perfectly fine. I
         | have two-device mode activated at all times.
         | 
         | My SO uses them (she has her own XM4's) with Windows and her
         | phone, and also never had any problems.
         | 
         | Maybe it's a hardware issue?
        
         | LtWorf wrote:
         | I have no issues with bluetooth. Just click on the device,
         | associate and then it works. After the 1st time just being on
         | is enough.
        
       | ladzoppelin wrote:
       | "Professional audio will typically utilize 24-bit. Everything
       | higher than that is usually bogus. Bogus where only audiophiles
       | will hear a difference." Does he mean internal DAW bit rates like
       | 64/32bit float are bogus? I am probably reading it wrong.
        
         | swatcoder wrote:
         | I read them as talking about _listening_, as represented in
         | mentioning audiophiles.
         | 
         | The extra depth/range available in DAWs are useful for effects
         | processing, mixing, and mastering and are a little colored by
         | trying to squeeze max-performance DSP on a general-
         | purpose/commodity CPU. I just don't take them as talking about
         | that here though.
        
           | tialaramex wrote:
           | And the bits are basically free. If we had very cheap 24-bit
           | floats and nothing bigger, maybe we'd use those, but we've
           | got cheap 32-bit floats, so those are fine.
           | 
           | The most important property of floating point is "infinite
           | headroom". In integer space, sixteen times quieter means 4
           | fewer bits of audio, get the levels wrong badly enough and
           | people can hear your mistake even if you fix it later - but
           | in float space it barely makes any difference, so long as the
           | levels are correct in the final consumed audio nobody cares.
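
A rough illustration of the headroom point above, as a sketch only. The sample values are arbitrary, and `to_f32` is a helper made up here to round Python's 64-bit floats down to 32-bit precision. A gain of 1/16 is a power of two, so the float round trip is exact; other gains would introduce a small rounding error in float as well, though far below the integer case.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


samples = [12345, -20001, 7, 32767]  # arbitrary 16-bit sample values

# Integer path: attenuate by 16, store as an integer, restore the gain later.
int_quiet = [s // 16 for s in samples]
int_restored = [q * 16 for q in int_quiet]

# Float path: the same gain changes, with every result stored as float32.
f32_quiet = [to_f32(s / 16) for s in samples]
f32_restored = [to_f32(q * 16) for q in f32_quiet]

# Absolute error per sample after the round trip.
int_errors = [abs(a - b) for a, b in zip(samples, int_restored)]
f32_errors = [abs(a - b) for a, b in zip(samples, f32_restored)]

print("integer round-trip errors:", int_errors)
print("float32 round-trip errors:", f32_errors)
```

In the integer path the low 4 bits of each sample are gone for good once the quiet version has been stored, so fixing the level later cannot bring them back.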
        
             | creeble wrote:
             | "16 times quieter" is not 4 bits.
             | 
             | "Half volume" is subjective, and for music is typically
             | between 6 and 10dB (most US audio engineering classes use
             | 10dB).
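
The two readings of "16 times quieter" can be put side by side with a little decibel arithmetic. This is a sketch under stated assumptions: the factor of 16 is taken as a linear amplitude ratio, and the 10 dB "half volume" figure is simply the one quoted in the comment above.

```python
import math

amplitude_ratio = 16  # "sixteen times quieter" read as a linear amplitude ratio

# Level change in decibels for an amplitude (not power) ratio.
attenuation_db = 20 * math.log10(amplitude_ratio)

# Each bit of integer PCM doubles the amplitude range, i.e. 20*log10(2) dB.
db_per_bit = 20 * math.log10(2)
bits_lost = attenuation_db / db_per_bit

# Assumption taken from the comment: about 10 dB is heard as "half volume".
perceived_factor = 2 ** (attenuation_db / 10)

print(f"{attenuation_db:.2f} dB, {bits_lost:.1f} bits, "
      f"~{perceived_factor:.1f}x quieter perceptually")
```

Read as sample amplitude, a factor of 16 is about 24 dB, or 4 bits. Read as perceived loudness with 10 dB per halving, the same 24 dB is only a bit more than two halvings.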
        
             | PaulDavisThe1st wrote:
             | We would NOT use 24 bit floats since that would make them
             | less than ideal at matching the hypothetical (and almost
             | certainly never reached) 24 bit resolution of integer DAC/ADC
             | hardware.
             | 
             | The reason why 32 bit floats work great is that they can
             | handle a 24 bit integer without any loss, and then if for
             | some reason the values get kicked up above the maximum you
             | can represent there, you get subtle noise rather than heavy
             | distortion.
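
The "24 bit integer without any loss" claim follows from float32 having a 24-bit significand. A quick check, offered as a sketch: `to_f32` is a helper made up here, and the range is sampled with an odd stride rather than exhaustively to keep it fast.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


INT24_MIN, INT24_MAX = -(2**23), 2**23 - 1

# Sample the signed 24-bit range with an odd stride, plus both extremes.
values = list(range(INT24_MIN, INT24_MAX + 1, 4099)) + [INT24_MIN, INT24_MAX]
lossless = all(to_f32(float(v)) == v for v in values)

# Beyond 2**24, odd integers no longer fit in the significand
# and get rounded to a neighbouring representable value.
first_inexact = 2**24 + 1
rounded = to_f32(float(first_inexact))

print("24-bit samples survive float32:", lossless)
print(first_inexact, "is stored as", rounded)
```

Values past the 24-bit range are rounded to a nearby float instead of clipping at a hard ceiling, which is the "subtle noise rather than heavy distortion" behaviour described above.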
        
         | hatthew wrote:
         | If you listen to an audio file at 24 bit vs 64 bits (bit depth,
         | not bitrate), you won't notice a difference. However, if you're
         | manipulating audio in a DAW or similar, it's possible for noise
         | to end up amplified in the final output, so a higher bit depth
         | could make a difference.
         | 
         | Think of it this way: every time you add a filter or any type
         | of audio manipulation in your DAW, you're discarding some
         | information and replacing it with noise (how much depends on
         | what manipulation you're doing, but it's almost always >0). If
         | you start at 24 bits and then don't manipulate anything, it's
         | all good. But if you start at 24 bits and then lose 10 bits of
         | the true signal, you're down to just 14 bits of information.
         | But if you start at 64 bits, you can lose 40 bits before you
         | start to notice anything (or really it depends quite a lot on
         | many different factors, but in general there's a threshold
         | where noise goes from "not noticeable" to "noticeable" and it's
         | probably usually between 8 bits and 32 bits).
         | 
         | Don't quote me on the details (I am not an audio engineer or
         | anything even slightly related), but that's the general gist of
         | it.
        
         | Joeboy wrote:
         | I think he's kind of wrong. As you say, anything going through
         | any kind of professional audio editing software is probably
         | 32/64 bit float. AFAIK all audio plugin standards work on 32/64
         | bit floats.
         | 
         | Although I imagine at least historically that's more because 32
         | bit floats are a native data type.
        
       | epx wrote:
       | I miss the simplicity of OSS :\
        
         | OsrsNeedsf2P wrote:
         | Hardware gets more nuanced and Linux needs to accommodate it.
         | Otherwise we'd be stuck with blurry fonts and no UI scaling
         | like it's 2014
        
           | akira2501 wrote:
           | Consumer grade audio hardware has not gotten any more
           | "nuanced" for several decades now. For the vast majority of
           | use cases OSS was perfectly fine and it offered more than
           | enough API to handle new features.
           | 
           | For the small minority of use cases where you might have two
           | sound cards and you may want to do some kind of sample
           | accurate combined production between the two at very low
           | latencies, sure, OSS was _somewhat_ inadequate.
           | 
           | So we ended up with a giant complicated audio stack where the
           | boundaries between kernel space and user space are horribly
           | blurred and create insane amounts of confusion and lost hours
           | to benefit the 1% of users who might actually use those
           | features.
           | 
           | It was a complete mistake.
        
             | vetinari wrote:
             | OSS was inadequate the very day it was introduced; it
             | couldn't even handle hardware available at the time (GUS,
             | for example). It was really just a mapping of the
             | Soundblaster to a device file. For a single process, of
             | course; all the others would have to wait, mute. For
             | mixing multiple inputs, you would need that dreaded daemon.
             | Or GUS-like hardware, but with enough channels that you
             | won't run out of them. But then, mixing them on the CPU is
             | more efficient than pushing them all over an external bus.
             | 
             | In a modern computer, you might have more sound cards than
             | you are aware; the onboard sound codec, the outputs on your
             | graphics card (that thing that pushes sound over DP/HDMI is
             | a separate "sound card"), you might have some usb device
             | (soundbars on monitors are usually usb sound devices),
             | heck, even microphones from the last two decades have their
             | own output. Webcam? Another sound device. Gamepad? That one
             | too. And that's before anyone connects anything bluetooth.
             | So it is not a small minority, in fact, it is the vast
             | majority.
             | 
             | The audio stack boundary is in user space; period. It does
             | stuff that doesn't belong in the kernel and is a perfect
             | candidate for a daemon.
        
         | gnramires wrote:
         | I use Void Linux, and find it reasonably simple :) (the reason
         | I like the distro essentially)
         | 
         | Nothing against complex things, if that's your thing though.
         | (usually complex things are made to be 'easier'/more convenient
         | to operate too, for some definition of easier)
        
           | ssl-3 wrote:
           | I think they meant OSS (Open Sound System), not OSS (Open
           | Source Software). In the Linux space, OSS predates ALSA.
           | 
           | (Back in the OSS days, we tended to use the term "free
           | software" or even "copyleft" more than we did "OSS" to
           | describe software licensing.)
        
         | miffe wrote:
         | Yeah, IMHO the best audio Linux has ever had was with OSS v3
         | and a soundcard that did hardware mixing. No software mixers
         | like ESD or aRts were needed.
        
           | PaulDavisThe1st wrote:
           | There have been no cards that can do hardware mixing in
           | production for more than 15 years. This is delusional.
           | 
           | Also, the cards that could do that back in the day were,
           | audio quality speaking, shite.
           | 
           | If that's really what you consider "the best audio linux has
           | ever", I think you don't know audio on linux very well.
           | 
           | I will grant you one thing: if you did have one of those
           | cards, it certainly made multiple applications all playing
           | (same sample rate) audio at the same time as easy as it could
           | be. But that's all.
        
         | self_awareness wrote:
         | I think OSS is still a default sound framework on FreeBSD?
        
       | lofaszvanitt wrote:
       | Well, the most confusing part of linux is definitely the audio
       | stack. Thanks for the writeup.
        
       | Venn1 wrote:
       | No mention of AoIP. I make heavy use of Netjack2 in my production
       | / streaming studio. Great way to move 25/30 channels of audio
       | between 5 PCs in real-time.
       | 
       | Beats the pants off DANTE.
        
         | jauntywundrkind wrote:
         | PipeWire is starting to get AES67 support, which seems to be
         | the audio and/or video streaming standard the industry is
         | rallying around. PTPv2 vs DANTE's PTPv1, and just a much
         | clearer protocol. I'm so excited for it!
         | https://gitlab.freedesktop.org/pipewire/pipewire/-/wikis/AES...
         | 
         | There's a bunch of neat hardware listed in a ticket thread that
         | folks have been playing with. Bluetooth to AES67 adapters,
         | analog to AES67, whole huge video wall streamers.
         | https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/32...
        
       ___________________________________________________________________
       (page generated 2024-07-26 23:05 UTC)