[HN Gopher] The Linux audio stack demystified
___________________________________________________________________
The Linux audio stack demystified
Author : ruffyx64
Score : 93 points
Date : 2024-07-22 14:58 UTC (4 days ago)
(HTM) web link (blog.rtrace.io)
(TXT) w3m dump (blog.rtrace.io)
| ruffyx64 wrote:
| Wrote this blog article as I needed to get a better understanding
| of the audio stack on Linux (esp. PipeWire, PulseAudio, ALSA,
| etc. ...). The article turned out to be a lengthy in-depth
| explanation of how audio works, how digital audio works, and what
| sound servers on linux actually do. Tried to write it in a way so
| it is accessible and understandable for beginners but also
| enlightening for experienced users. Hope it's helpful to HN.
| brudgers wrote:
| My experience:
|
| I'm interested in how Linux Audio works. The first half of the
| article covers other topics. It could be a separate article. An
| article focused on Linux Audio could say "For audio basics,
| click this link to my article on Audio Basics."
|
| Even for beginners, that's useful because even beginners just
| want to get sound out of their speakers and anatomy and physics
| lessons are in the way. It's ok to start with ALSA. There's no
| need to boil the ocean.
| chung8123 wrote:
| I really appreciate blogs/articles like this. It really helps
| me get beyond the surface on things and I always learn
| something. Thanks for taking the time to share.
| Modified3019 wrote:
| Learning by trying to teach is probably the best way to clarify
| and crystallize what we think we know. Always appreciate these
| kind of posts, especially since they tend to shine a light on
| all the contextual bullshit that experts take for granted.
|
| Right now I'm doing the same for an identification/contextual
| guide of local weeds and insects for seasonal scouts (I'm an
| agronomist). Unfortunately I find complexity tends to quickly
| become fractal and highly interlinked and it's hard to set an
| entry point or tell when to limit scope.
|
| I think you've done a great job of doing just that.
| anvuong wrote:
| Thanks for the nice writing. But do you have any insight on why
| is bluetooth audio so clunky on Linux? I'm using a pair of Sony
| XM4 and I have never had any problems on my 4 Windows machines.
| But on Ubuntu (both 22.04 and 24.04), I have had to jump through
| many hoops, from editing a bunch of config files, changing kernel
| flags, and disabling and enabling a bunch of things I don't
| understand (mostly from reading the Arch Wiki), just to get it
| working _some_ of the time. Some days it will just outright refuse
| to connect, sometimes it connects but doesn't play anything
| (switching the audio device to it generates some undecipherable
| error logs), and (probably worst) sometimes it connects very
| quickly but stays locked in low-fidelity mode instead of a2dp
| sink. I'm so fed up that I just switch to wired headphones every
| time I use my Ubuntu machine.
| cogman10 wrote:
| It's so clunky, IMO, because bluetooth is a dumbass protocol
| with things in the standard that should not be there (including
| which audio codecs are supported with which levels of
| bluetooth). Rather than just being a more simple network of
| wireless devices, it's a very complex protocol which makes
| everything more complicated.
|
| The cause of your trouble could be anything from the firmware
| blob for your bluetooth device, to the kernel driver installed,
| to bluez, to the sound server you are using. Any one of those
| things messing up will lead to a bad experience.
|
| I've had a relatively good experience with kde-plasma's
| bluetooth management stuff. But I still have to do dumb things
| like manually selecting which audio codec to use when I go on a
| call.
|
| How could bluetooth be better? It should be at least 2
| standards. 1 defining the wireless data transfer and network
| capabilities, a second which defines how a computer negotiates
| with a device to send audio. It shouldn't be 2 standards merged
| together like it currently is. Wifi Direct is more what
| bluetooth should be.
| disinterred wrote:
| I use arch linux and have never had an issue with pairing
| bluetooth with anything. In fact, imho, it works much smoother
| than Windows because I keybind bluetoothctl to connect to any
| bluetooth headphones, speakers, keyboard or whatever
| automatically using their bluetooth device IDs. To do this you
| must first pair them (I use the blueman-manager gui) and then
| get their bluetooth device ids and keybind the bluetoothctl
| command. All of this is easy to do by asking ChatGPT. Hope this
| helps.
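|
| A rough sketch of the hotkey idea above, assuming the device is
| already paired. The MAC address and the helper names are made up
| for illustration; `bluetoothctl devices` lists the real addresses.
| The script only wraps `bluetoothctl connect <address>`, so a
| desktop keybinding can point at it.

```python
import re
import subprocess

# Hypothetical address, for illustration only.
HEADPHONES = "AA:BB:CC:DD:EE:FF"

MAC_RE = re.compile(r"(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}")


def connect_command(mac: str) -> list[str]:
    """Build the bluetoothctl invocation for an already-paired device."""
    if not MAC_RE.fullmatch(mac):
        raise ValueError(f"not a Bluetooth MAC address: {mac!r}")
    return ["bluetoothctl", "connect", mac]


def connect(mac: str) -> bool:
    """Run the command; return True if bluetoothctl exited successfully."""
    result = subprocess.run(connect_command(mac), capture_output=True, text=True)
    return result.returncode == 0


# Typical use from a keybinding script:
#     connect(HEADPHONES)
```

| As others note in this thread, a paired device normally reconnects
| on its own, so this only matters when juggling several hosts.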
| ssl-3 wrote:
| I've never done much with Bluetooth under desktop Linux, but
| that sounds like a woeful pain in the ass compared to the
| usual steps for Android or Windows:
|
| 1. Pair headphones in a couple of clicks/taps; sound comes
| out.
| jauntywundrkind wrote:
| You can just pair as usual, yes, like any other OS, via a
| similar gui. And the device will then reconnect in the
| future.
|
| What the parent is describing is an advanced flow, that can
| be helpful if you have lots of computers & need to juggle
| bt devices.
|
| Setting up a hotkey just takes some _pre-work_. _This
| workflow is optional._ But it saves time & effort _if_ for
| some reason you are one of the very few users who moves
| devices around a lot.
| Izkata wrote:
| A hotkey is more work than GP is describing. Pairing is a
| one-time thing, after that they connect automatically
| when the headphones are on and nearby.
|
| ...which, also, is exactly what mine do with Ubuntu. I
| used bluetoothctl to pair them once when I first got
| them, and when I turn them on Ubuntu automatically
| connects and switches the audio over. I don't have the
| same model headphones as GGGP, so I'm guessing it's a
| problem specifically with that model's implementation
| (Edit: or from another person who has the same model and
| no issues, perhaps some combination of hardware/software
| specific to that user).
| jauntywundrkind wrote:
| I think we're actually somewhat in alignment, but when
| you say
|
| > _Pairing is a one-time thing,_
|
| You ignore the two scenarios I face regularly, that stem
| from me having lots of devices and lots of computers &
| wanting to switch around what's paired to what.
|
| We both seem to be trying to defeat the notion that using
| Bluetooth in Linux is hard or special (it's not at all,
| it works like anywhere else, and these reports of it
| being hard are from people with _at best_ extremely small
| domains of experience & knowledge).
|
| I was trying to add that Linux has further upsides for
| when you do want to go further, and highlight & interpret
| the parent post to show how I have those issues &
| describe how adding hotkeys (something only Linux does)
| would help me, an advanced user juggling many systems &
| devices. I've clarified my post to mention that auto-
| reconnecting will just work in most scenarios (but I get
| why some folks might think it's cool to have hotkeys).
| LtWorf wrote:
| Yes the couple of clicks is the pairing. You have to pair.
| ssl-3 wrote:
| Then this keybinding and device ID management business
| accomplishes what, exactly, other than exercising extra
| steps?
| jpeloquin wrote:
| I also have XM4's and they worked fine on Arch after addressing
| two problems:
|
| Do you dual boot? Different OS's on the same computer will
| generate different pairing keys even though they share the same
| MAC, and this will cause connection issues. Usually that's
| reported as having to re-pair every time you switch OS's
| though.
|
| https://unix.stackexchange.com/questions/255509/bluetooth-pa...
|
| I've also experienced audio skipping & popping using a dual
| WiFi/Bluetooth card that were eliminated by disabling WiFi.
| Apparently the Linux driver was faulty and allowed some
| interference; the card worked fine on Windows.
| pdw wrote:
| Debian does not ship the AAC codec, due to legal quagmire
| surrounding the necessary code. The same probably goes for
| Ubuntu. That might be the cause of at least some of your
| problems. https://tookmund.com/2024/02/aac-and-debian
| self_awareness wrote:
| > I'm using a pair of Sony XM4 and I have never had any
| > problems on my 4 Windows machines. But on Ubuntu (both 22.04
| > and 24.04), I have had to jump through many hoops [...]
|
| I also have XM4's (best headphones in my life; seriously,
| they've saved my sanity and lowered my stress levels, more than
| a few times), but I never had any problems with BT pairing. I
| use them with my phone, Ubuntu, OpenSUSE, ArchLinux and macOS,
| although not Windows, and they always pair up perfectly fine. I
| have two-device mode activated at all times.
|
| My SO uses them (she has her own XM4's) with Windows and her
| phone, and also never had any problems.
|
| Maybe it's a hardware issue?
| LtWorf wrote:
| I have no issues with bluetooth. Just click on the device,
| associate and then it works. After the 1st time just being on
| is enough.
| ladzoppelin wrote:
| "Professional audio will typically utilize 24-bit. Everything
| higher than that is usually bogus. Bogus where only audiophiles
| will hear a difference." Does he mean internal DAW bit rates like
| 64/32-bit float are bogus? I am probably reading it wrong.
| swatcoder wrote:
| I read them as talking about _listening_ , as represented in
| mentioning audiophiles.
|
| The extra depth/range available in DAW's are useful for effects
| processing, mixing, and mastering and are a little colored by
| trying to squeeze max-performance DSP on a general-
| purpose/commodity CPU. I just don't take them as talking about
| that here though.
| tialaramex wrote:
| And the bits are basically free. If we had very cheap 24-bit
| floats and nothing bigger, maybe we'd use those, but we've
| got cheap 32-bit floats, so those are fine.
|
| The most important property of floating point is "infinite
| headroom". In integer space, sixteen times quieter means 4
| fewer bits of audio, get the levels wrong badly enough and
| people can hear your mistake even if you fix it later - but
| in float space it barely makes any difference, so long as the
| levels are correct in the final consumed audio nobody cares.
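|
| A small sketch of the headroom point, under stated assumptions: a
| 0.9-amplitude sine is stored 16 times too quiet, then boosted back
| to the intended level. The signal, the gain, and the helper names
| are illustrative choices, not anything from the article.

```python
import math
import struct


def to_int16(x: float) -> int:
    """Quantize a sample in [-1.0, 1.0] to a signed 16-bit integer."""
    return max(-32768, min(32767, round(x * 32767)))


def from_int16(s: int) -> float:
    return s / 32767


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def worst_error(store, load, gain: float, n: int = 1000) -> float:
    """Store a sine at `gain` times its level, restore it, report the worst error."""
    worst = 0.0
    for i in range(n):
        x = 0.9 * math.sin(2 * math.pi * i / n)
        restored = load(store(x * gain)) / gain
        worst = max(worst, abs(restored - x))
    return worst


int_unity = worst_error(to_int16, from_int16, gain=1.0)
int_quiet = worst_error(to_int16, from_int16, gain=1 / 16)
float_quiet = worst_error(to_float32, lambda s: s, gain=1 / 16)

print(f"int16 at correct level:   {int_unity:.2e}")
print(f"int16 at 1/16 level:      {int_quiet:.2e}")
print(f"float32 at 1/16 level:    {float_quiet:.2e}")
```

| The integer error grows with the size of the level mistake, while
| the float error stays tied to the sample's own magnitude.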
| creeble wrote:
| "16 times quieter" is not 4 bits.
|
| "Half volume" is subjective, and for music is typically
| between 6 and 10dB (most US audio engineering classes use
| 10dB).
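|
| One possible reading is that the two comments measure different
| things: signal amplitude versus perceived loudness. The amplitude
| side can be checked with the usual 20*log10 convention; the
| function names here are made up for the example.

```python
import math


def amplitude_ratio_to_db(ratio: float) -> float:
    """Convert an amplitude (not power) ratio to decibels."""
    return 20 * math.log10(ratio)


def db_to_bits(db: float) -> float:
    """Each bit of linear PCM doubles the amplitude range: 20*log10(2) dB."""
    return db / amplitude_ratio_to_db(2)


print(amplitude_ratio_to_db(16))              # about 24.08 dB
print(db_to_bits(amplitude_ratio_to_db(16)))  # about 4 bits
print(db_to_bits(10))                         # about 1.66 bits
```

| So a 16x drop in amplitude costs about four bits of integer
| resolution, while a 10dB "half as loud" step costs under two.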
| PaulDavisThe1st wrote:
| We would NOT use 24 bit floats since that would make them
| less than ideal at matching the hypothetical (and almost
| certainly never reached) 24 bit resolution of integer DAC/ADC
| hardware.
|
| The reason why 32 bit floats work great is that they can
| handle a 24 bit integer without any loss, and then if for
| some reason the values get kicked up above the maximum you
| can represent there, you get subtle noise rather than heavy
| distortion.
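|
| The "without any loss" part follows from the 24-bit significand of
| a 32-bit float. A quick check, using Python's struct module to
| round-trip values through a real 32-bit float (the helper name and
| the sampling stride are arbitrary choices for the sketch):

```python
import struct


def roundtrip_float32(n: int) -> int:
    """Store an integer in a 32-bit float and read it back."""
    return int(struct.unpack("<f", struct.pack("<f", float(n)))[0])


INT24_MIN, INT24_MAX = -(2**23), 2**23 - 1

# Both ends of the signed 24-bit range plus a stride through the middle.
samples = [INT24_MIN, INT24_MIN + 1, -1, 0, 1, INT24_MAX - 1, INT24_MAX]
samples += range(INT24_MIN, INT24_MAX, 4099)
assert all(roundtrip_float32(n) == n for n in samples)

# One bit past the significand, odd integers start to get rounded.
print(roundtrip_float32(2**24 + 1))  # 16777216
```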
| hatthew wrote:
| If you listen to an audio file at 24 bit vs 64 bits (bit depth,
| not bitrate), you won't notice a difference. However, if you're
| manipulating audio in a DAW or similar, it's possible for noise
| to end up amplified in the final output, so a higher bit depth
| could make a difference.
|
| Think of it this way: every time you add a filter or any type
| of audio manipulation in your DAW, you're discarding some
| information and replacing it with noise (how much depends on
| what manipulation you're doing, but it's almost always >0). If
| you start at 24 bits and then don't manipulate anything, it's
| all good. But if you start at 24 bits and then lose 10 bits of
| the true signal, you're down to just 14 bits of information.
| But if you start at 64 bits, you can lose 40 bits before you
| start to notice anything (or really it depends quite a lot on
| many different factors, but in general there's a threshold
| where noise goes from "not noticeable" to "noticeable" and it's
| probably usually between 8 bits and 32 bits).
|
| Don't quote me on the details (I am not an audio engineer or
| anything even slightly related), but that's the general gist of
| it.
| Joeboy wrote:
| I think he's kind of wrong. As you say, anything going through
| any kind of professional audio editing software is probably
| 32/64 bit float. AFAIK all audio plugin standards work on 32/64
| bit floats.
|
| Although I imagine at least historically that's more because 32
| bit floats are a native data type.
| epx wrote:
| I miss the simplicity of OSS :\
| OsrsNeedsf2P wrote:
| Hardware gets more nuanced and Linux needs to accommodate it.
| Otherwise we'd be stuck with blurry fonts and no UI scaling
| like it's 2014
| akira2501 wrote:
| Consumer grade audio hardware has not gotten any more
| "nuanced" for several decades now. For the vast majority of
| use cases OSS was perfectly fine and it offered more than
| enough API to handle new features.
|
| For the small minority of use cases where you might have two
| sound cards and you may want to do some kind of sample
| accurate combined production between the two at very low
| latencies, sure, OSS was _somewhat_ inadequate.
|
| So we ended up with a giant complicated audio stack where the
| boundaries between kernel space and user space are horribly
| blurred and create insane amounts of confusion and lost hours
| to benefit the 1% of users who might actually use those
| features.
|
| It was a complete mistake.
| vetinari wrote:
| OSS was inadequate from the day it was introduced; it
| couldn't even handle hardware available at the time (GUS, for
| example). It was really just a mapping of the Soundblaster to
| a device file, and for a single process at that: all the
| others would have to wait, mute. For mixing multiple inputs,
| you would need that dreaded daemon, or GUS-like hardware with
| enough channels that you won't run out of them. But then,
| mixing them on the CPU is more effective than pushing them all
| over an external bus.
|
| In a modern computer, you might have more sound cards than
| you are aware; the onboard sound codec, the outputs on your
| graphic card (that thing that pushes sound over DP/HDMI is
| a separate "sound card"), you might have some usb device
| (soundbars on monitors are usually usb sound devices),
| heck, even microphones from the last two decades have their
| own output. Webcam? Another sound device. Gamepad? That one
| too. And that's before anyone connects anything bluetooth.
| So it is not a small minority, in fact, it is the vast
| majority.
|
| The audio stack boundary is in user space, period. It does
| stuff that doesn't belong in the kernel and is a perfect
| candidate for a daemon.
| gnramires wrote:
| I use Void Linux, and find it reasonably simple :) (the reason
| I like the distro essentially)
|
| Nothing against complex things, if that's your thing though.
| (usually complex things are made to be 'easier'/more convenient
| to operate too, for some definition of easier)
| ssl-3 wrote:
| I think they meant OSS (Open Sound System), not OSS (Open
| Source Software). In the Linux space, OSS predates ALSA.
|
| (Back in the OSS days, we tended to use the term "free
| software" or even "copyleft" more than we did "OSS" to
| describe software licensing.)
| miffe wrote:
| Yeah, IMHO the best audio linux has ever had was with OSS v3 and
| a soundcard that did hardware mixing. No software mixers like
| ESD or aRts were needed.
| PaulDavisThe1st wrote:
| There have been no cards that can do hardware mixing in
| production for more than 15 years. This is delusional.
|
| Also, the cards that could do that back in the day were,
| audio quality speaking, shite.
|
| If that's really what you consider "the best audio linux has
| ever", I think you don't know audio on linux very well.
|
| I will grant you one thing: if you did have one of those
| cards, it certainly made multiple applications all playing
| (same sample rate) audio at the same time as easy as it could
| be. But that's all.
| self_awareness wrote:
| I think OSS is still a default sound framework on FreeBSD?
| lofaszvanitt wrote:
| Well, the most confusing part of linux is definitely the audio
| stack. Thanks for the writeup.
| Venn1 wrote:
| No mention of AoIP. I make heavy use of Netjack2 in my production
| / streaming studio. Great way to move 25/30 channels of audio
| between 5 PCs in real-time.
|
| Beats the pants off DANTE.
| jauntywundrkind wrote:
| PipeWire is starting to get AES67 support, which seems to be
| the audio and/or video streaming standard the industry is
| rallying around. PTPv2 vs DANTE's PTPv1, and just a much
| clearer protocol. I'm so excited for it!
| https://gitlab.freedesktop.org/pipewire/pipewire/-/wikis/AES...
|
| There's a bunch of neat hardware listed in a ticket thread that
| folks have been playing with. Bluetooth to AES67 adapters,
| analog to AES67, whole huge video wall streamers.
| https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/32...
___________________________________________________________________
(page generated 2024-07-26 23:05 UTC)