[HN Gopher] Fedora 38 LLVM vs. Team Fortress 2
___________________________________________________________________
Fedora 38 LLVM vs. Team Fortress 2
Author : st_goliath
Score : 88 points
Date : 2023-04-24 18:54 UTC (4 hours ago)
(HTM) web link (airlied.blogspot.com)
(TXT) w3m dump (airlied.blogspot.com)
| DannyBee wrote:
| Fedora 38 includes the LLVM15 libs to maintain backwards
| compatibility.
|
| Why is this automatically using a new, incompatible solib,
| instead of a versioned solib?
| AnssiH wrote:
| The LLVM dependency is in the HW-specific driver solib which is
| loaded by the OpenGL library, which has the same soname as
| before.
| IceWreck wrote:
| Isn't this the reason why people recommend using the flatpak
| version of Steam ?
| PlutoIsAPlanet wrote:
| Yes, especially on Fedora.
|
| This isn't something Fedora is doing wrong, unfortunately some
| games build against older libraries or are built against
| Debian/Ubuntu and the Flatpak runtimes generally have better
| compatibility.
| exabrial wrote:
| I thought TF2 was pretty much 100% hacked... like no legit non-
| hackers playing except at LAN parties.
| sosodev wrote:
| It's unfortunate but the Steam experience on Linux seems to be
| progressively getting worse (outside of Steam Deck ofc). The
| Steam client is often borderline unusable for Linux users. You
| can find many issue threads on GitHub reporting client freezes
| and crashes.
|
| It seems like a big part of the issues is a lack of maintenance.
| TF2 would actually run better on Linux via Proton but VAC isn't
| enabled so you can't join the vast majority of servers.
|
| Valve also has existing Source engine tooling that allows Linux
| ports to drop OpenGL entirely (dxvk-native as used by Portal 2
| and L4D2) but they haven't added it to TF2... :(
| ho_schi wrote:
| I can't see how native Linux support is getting worse. Linux
| users are good at bug reporting. Maybe some developers should
| care more about compatibility. And yes, especially the
| heterogeneous setups used by some makes support difficult.
|
| I'm worried that Valve puts too many resources into Proton
| (derivative of WINE) instead of tooling for native ports. Yes,
| Proton is needed to provide initial compatibility. But Proton
| is another layer of complexity (more bugs, integration, system
| resources) which requires more programming. I started playing
| CS again after it was ported natively in 2014, it runs well and
| all issues with WINE were gone.
|
| If Proton becomes too "good" we end up in a situation with a
| high maintenance burden for Valve. Game developers will rely on
| it and Valve has all the constant work. Instead game developers
| should treat Linux as first-class platform for AAA-Titles, for
| which they need appropriate APIs, compatibility and tooling. As
| Valve does itself support Linux as first-class platform from
| HL2 to CSGO. The target shall be official support from the very
| first day.
|
| Anyway. Looks like Valve has chosen a special implementation
| for TF2? What I miss here is a link to a bug report. Ideally
| opened months ago :)
| danbolt wrote:
| I think Valve has a financial incentive to keep Proton
| compatibility in a positive state, as it increases sales of
| the Steam Deck and encourages players to remain in their
| ecosystem. Or, I think it's more likely than the majority of
| AAA game developers having a financial incentive to maintain
| Linux versions of their products.
| pjmlp wrote:
| Game studios already know Linux distributions quite well on
| the server, and most AAA games on Android are basically only
| using the NDK, meaning ISO C and C++, OpenGL ES, Vulkan,
| OpenSL.
|
| Besides that, PlayStation OS is based on FreeBSD. Even if the
| 3D API is different, it is just yet another backend.
|
| They don't port them, because the QA and support aren't worth
| the sales, that is about it.
| zamalek wrote:
| > You can find many issue threads on GitHub reporting client
| freezes and crashes.
|
| The fact that these are happening does not necessarily mean the
| client is getting worse. For example, it _could_ mean that more
| people are installing Steam for Linux. There is no baseline to
| say it's getting worse, because nobody opens an issue saying
| "all working here."
|
| In my experience, the _only_ issue I have on Wayland is this:
| https://github.com/ValveSoftware/steam-for-linux/issues/7245
| (workaround: disable animated avatars) (edit: all AMD machine)
|
| > outside of Steam Deck ofc
|
| There is nothing special about the Steam Deck. It's just
| another Linux machine.
|
| > TF2
|
| I don't play any Source games, but I could see TF2 having
| issues because it's in maintenance mode. If it is bjorked that
| has nothing to do with Steam.
| sosodev wrote:
| True, I don't have enough data to really make that claim. I
| can say that my own hardware hasn't changed in ~4 years and
| I've been using Steam for Linux since I built this machine.
| It's only within the last year or so that I started having
| major issues with the client.
|
| > there is nothing special about the steam deck
|
| How is first party support for the hardware and software
| stack "nothing special"?
|
| > If it is bjorked that has nothing to do with Steam.
|
| Maybe it's not directly related to the rest of my comment but
| it's related to the OP. I also think it's indicative of
| Valve's issues with Linux.
| zamalek wrote:
| > How is first party support for the hardware and software
| stack "nothing special"?
|
| Because the vast majority of that stack (the kernel, GPU
| driver, window manager, and so forth) has nothing to do
| with Valve. They might contribute drivers to the kernel
| (I'm not sure if they actually do - I would expect AMD to
| be doing that), but otherwise it's an Arch-based distro
| with the same Steam client and Proton runtime that everyone
| else is using.
| sosodev wrote:
| Yes, the Steam Deck is using a fairly standard stack and
| the default client but you're missing the point. Valve
| directly tests the Steam Deck and prioritizes bug fixes
| for it. When users report issues with other setups it
| often takes months for the identified bug to be fixed if
| it ever is.
| mariusor wrote:
| > There is nothing special about the Steam Deck. It's just
| another Linux machine.
|
| That's not true. It's a read-only linux on a fixed hardware
| platform, which is a vaaastly different beast than the myriad
| of hardware/software combinations that exist out there in the
| wild.
| zamalek wrote:
| > It's a read-only linux on a fixed hardware platform
|
| I have heard that argument about macOS a lot, and this is
| nothing like that. There isn't some "special sauce er...
| Source" that they apply to their platform. It's just GPL
| Linux. They may have avoided bad decisions like relying on
| NVIDIA for Linux gaming, but that's hardly the level of
| ownership that you see with other vertical integrations. If
| I use an AMD CPU (or Intel, which would be arguably better)
| and AMD GPU, there is no reason why my PC couldn't be just
| as "first-party" as the Steam Deck.
|
| Wine/Proton ultimately access the GPU through DRM, that
| remains the same for Valve hardware or custom-built
| hardware. Both Steam and Wine/Proton currently render via
| X11 (via XWayland if necessary), on both my PC and the
| Steam Deck.
|
| I feel like there is a gap of understanding how a HAL works
| here.
| mlyle wrote:
| > It's just GPL Linux.
|
| "Just" GPL Linux encompasses myriad library versions,
| kernel versions, driver versions and varied hardware.
|
| > I feel like there is a gap of understanding how a HAL
| works here.
|
| Just because you have a HAL doesn't mean that you don't
| get different behavior and crashes with different numbers
| of CPUs/concurrency or other hardware beneath. Modern
| GPUs are also pretty complicated beasts, and assuming
| that's fully abstracted is a mistake.
|
| And this all leaves aside the myriad of other problems
| you can have with the ensemble of software running on the
| machine that interacts with the game (directly or
| indirectly).
|
| Being able to test and make one restricted platform work
| well is a far different beast than covering the huge mass
| of variation users create on their own machines.
| admax88qqq wrote:
| I feel like there is a gap in understanding of how
| commercial software deployment goes.
|
| When you have a platform like Steam Deck, it's the
| platform that gets tested by QA, and the platform that
| most of your devs are building for every day.
| mariusor wrote:
| Sure, a linux machine is made out of just a CPU and a
| GPU. Even if that would be the case, what about the
| software combinations that can exist and that the
| SteamDeck simplifies?
|
| In the gamedev world I heard a lot of people not wanting
| to support linux because they never know which glibc
| version to support, which mesa version to support, which
| hardware GPUs to support, which graphical API to support,
| etc.
|
| Cutting down that matrix (and I just mentioned the most
| egregious examples) to only one element is invaluable in
| ensuring your users have a bug free experience.
| Arnavion wrote:
| I run Steam in a Docker container of Ubuntu 22.04 for reasons
| like this. Also my actual system isn't polluted with 32-bit
| libs, Steam can't rm-rf my home directory and games can't steal
| files from my home directory (homedir inside the container is a
| separate directory on the host), and access to X and dbus is
| restricted (dbus socket not forwarded, X socket is from a
| nested Xephyr instance) so nothing can be stolen from there
| either.
|
| Edit: More details in
| https://news.ycombinator.com/item?id=34634854
| sosodev wrote:
| Is there a guide for this? I'd really like to isolate Steam
| from the rest of my system
| [deleted]
| Entinel wrote:
| Devil's advocate, I use Steam on Fedora and have had 0 issues.
| Very rarely freezes or crashes. It's probably the most stable
| application I use daily.
| 2OEH8eoCRo0 wrote:
| I use Steam on Fedora as well and I notice a lot of jank with
| the Steam client (Nvidia 1080ti). Dropdown menus popping
| through windows, sound may or may not work for videos,
| freezing, etc. It's usable but it's not very pleasant.
| Entinel wrote:
| Just for GPU comparison I'm on an AMD RX card so it could
| be an Nvidia issue which is known to be jank on Linux.
| sosodev wrote:
| It seems to work perfectly for some people. I've regularly
| had issues with the client not rendering at all, freezing,
| and crashing on Pop_OS 22.04 LTS with an nvidia GTX 1660ti.
| WaffleIronMaker wrote:
| I also use Steam on Fedora, and I've not had any issues with
| Stardew Valley, Factorio, Celeste, N++, Undertale, and
| others. I remember having a brief issue with Portal, but I
| was able to resolve it. Overall, I've had a good experience.
| amluto wrote:
| It should be straightforward to make a little LD_PRELOAD shim to
| implement the new operator new on top of old overloads and thus
| restore proper functioning.
|
| It would be a gross kludge, though.
| olliej wrote:
| I'm not sure that's sound. You can't just redirect an aligned
| new to the unaligned operator new as you may get unaligned
| result. It _sounds_ like what is happening is
|
|     a = ::operator new(some size, some alignment)
|     ...
|     ::operator delete(a);
|
| where delete is dropping the align_val_t parameter that would
| guarantee it hits the same allocator family. There are a
| variety of ways this can happen, and let's just take it as
| given that it is.
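|
| For reference, a minimal C++17 sketch of the overload selection
| described here. The type Wide and its counters are hypothetical,
| and the operators are class-scoped so that nothing global is
| replaced. For an over-aligned type, both the new-expression and
| the delete-expression prefer the align_val_t overloads when
| they exist:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical over-aligned type. Its allocation functions are class-scoped,
// so this example does not replace the global operator new/delete.
struct alignas(64) Wide {
    char data[64];

    static inline int plain_news = 0, aligned_news = 0;
    static inline int plain_deletes = 0, aligned_deletes = 0;

    static void* operator new(std::size_t sz) {
        ++plain_news;
        void* p = std::malloc(sz);
        if (p == nullptr) throw std::bad_alloc{};
        return p;
    }
    static void* operator new(std::size_t sz, std::align_val_t al) {
        ++aligned_news;
        // sizeof(Wide) is a multiple of its alignment, which aligned_alloc expects.
        assert(sz % static_cast<std::size_t>(al) == 0);
        void* p = std::aligned_alloc(static_cast<std::size_t>(al), sz);
        if (p == nullptr) throw std::bad_alloc{};
        return p;
    }
    static void operator delete(void* p) {
        ++plain_deletes;
        std::free(p);
    }
    static void operator delete(void* p, std::align_val_t) {
        ++aligned_deletes;
        std::free(p);
    }
};

// Allocates and frees one Wide. Returns true if exactly the aligned
// new/delete pair was used and the plain overloads were never called.
bool wide_uses_aligned_pair() {
    const int news_before = Wide::aligned_news;
    const int deletes_before = Wide::aligned_deletes;
    const int plain_before = Wide::plain_news + Wide::plain_deletes;
    Wide* w = new Wide;
    delete w;
    return Wide::aligned_news == news_before + 1 &&
           Wide::aligned_deletes == deletes_before + 1 &&
           Wide::plain_news + Wide::plain_deletes == plain_before;
}
```

| The scenario in the thread is the case where the two halves of
| such a pair end up in different allocators, which this sketch
| does not reproduce.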
|
| The problem is that if operator new(size_t, align_val_t) is
| called then the struct has an alignment annotation. That can
| lead to codegen that reasonably assumes alignment, even without
| any source level decisions that depend on alignment. The result
| of having some equivalent of (either at runtime or link time)
|
|     void * operator new(size_t sz, align_val_t a) {
|       if (operator new(size_t) has been overridden)
|         return ::operator new(sz);
|       ...
|     }
|
| could be an "aligned" allocation returning an unaligned value,
| causing crashes later on.
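|
| A toy illustration of that failure, not the actual TF2 or
| libstdc++ code. toy_alloc is a hypothetical stand-in for a
| replaced operator new(size_t) that only promises 16-byte
| alignment, and forwarded_aligned_alloc is a hypothetical shim
| that drops the alignment argument:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a replaced operator new(size_t): a bump
// allocator over a fixed pool that only promises 16-byte alignment.
alignas(64) static unsigned char toy_pool[1024];
static std::size_t toy_used = 0;

void* toy_alloc(std::size_t size) {
    const std::size_t rounded = (size + 15) & ~static_cast<std::size_t>(15);
    if (toy_used + rounded > sizeof(toy_pool)) return nullptr;
    void* p = toy_pool + toy_used;
    toy_used += rounded;
    return p;
}

bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// The forwarding shim under discussion: it ignores the requested alignment
// and hands the request to the unaligned allocator.
void* forwarded_aligned_alloc(std::size_t size, std::size_t /*alignment*/) {
    return toy_alloc(size);
}
```

| With this pool, the first 16-byte request asking for 64-byte
| alignment happens to land on a 64-byte boundary, and the second
| one lands 16 bytes later, so it is not aligned as requested.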
| viraptor wrote:
| If you don't mind wasting a bit of time, you could forward
| size+alignment to the allocator, return the aligned version
| and keep a record of aligned-to-allocation mapping. (For
| freeing later)
|
| But as the other comment mentioned - it should be a problem
| for tf2 in the first place since that's not the behaviour
| they're after.
| olliej wrote:
| > If you don't mind wasting a bit of time, you could
| forward size+alignment to the allocator, return the aligned
| version and keep a record of aligned-to-allocation mapping.
| (For freeing later)
|
| I'm unsure what you're proposing here - the only methods
| you know in the replacement allocator are operator
| new(size_t) and operator delete(void*). The two possible
| failure paths are:
|
|     a = ::operator new(some size)
|     ...
|     ::operator delete(a, alignment)
|
| and
|
|     a = ::operator new(some size, some alignment)
|     ...
|     ::operator delete(a)
|
| In the first case what you could do is say "if I did not
| allocate this pointer, optimistically forward it to
| operator delete(void*)", in the latter case you can
| identify that a different operator new(size_t) exists but
| you have no idea how to make that allocator produce an
| aligned allocation. What I guess you could do is round the
| size up to a multiple of the specified alignment, and then
| just repeatedly allocate in the hope that you will
| eventually get a correctly aligned value out. But that
| would not be guaranteed.
| nneonneo wrote:
| The latter suggestion assumes that there's enough entropy
| in the allocation process to make this work. But that's
| not guaranteed! Suppose that your allocator doesn't pad
| allocations (e.g. because it uses a bitmap), and that it
| only guarantees 0x10 alignment. If the top of the heap
| happens to be unaligned with respect to your desired
| alignment (e.g. address ends in 0x10 when you want 0x20
| alignment), you might wind up just repeatedly allocating
| unaligned blocks off the top of the heap forever.
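|
| A small model of that scenario, with every name and number
| hypothetical: an allocator with 0x10 granularity and no
| padding whose heap top starts at an offset ending in 0x10,
| plus the "round up and retry" loop:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical allocator: no padding, 0x10 granularity, and a heap top that
// currently sits at an offset ending in 0x10.
alignas(0x20) static unsigned char heap_area[4096];
static std::size_t heap_top = 0x10;

void* unpadded_alloc(std::size_t size) {
    assert(size % 0x10 == 0);
    if (heap_top + size > sizeof(heap_area)) return nullptr;
    void* p = heap_area + heap_top;
    heap_top += size;
    return p;
}

// Rounds size up to a multiple of alignment and keeps allocating until an
// aligned pointer shows up or max_tries is exhausted. Misaligned attempts
// are simply leaked in this toy.
void* retry_until_aligned(std::size_t size, std::size_t alignment, int max_tries) {
    assert(alignment != 0 && alignment % 0x10 == 0);
    const std::size_t rounded = (size + alignment - 1) / alignment * alignment;
    for (int i = 0; i < max_tries; ++i) {
        void* p = unpadded_alloc(rounded);
        if (p == nullptr) return nullptr;  // heap exhausted
        if (reinterpret_cast<std::uintptr_t>(p) % alignment == 0) return p;
    }
    return nullptr;
}
```

| Every block starts 0x10 past a 0x20 boundary and has a size
| that is a multiple of 0x20, so the next block does too, and
| the loop never sees an aligned pointer.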
|
| This is not an easy problem to solve, unfortunately. On
| MacOS I believe they solve this problem using the two-
| level namespace: symbol references include the library
| name, so "operator new(size_t)" from libstdc++ is
| distinct from "operator new(size_t)" from libtcmalloc.
|
| Symbol versioning also seems like it should solve the
| problem: have the new interfaces explicitly declared with
| a newer ABI version (e.g. @@LIBCXX_17) and link only to
| those new versions from code that expects them. Of
| course, symbol versioning comes with its own set of nasty
| drawbacks, but in this case it seems like a solution that
| might work?
| olliej wrote:
| > The latter suggestion assumes that there's enough
| entropy in the allocation process to make this work. But
| that's not guaranteed!
|
| Oh absolutely, there's no guarantee it's ever aligned:
| the allocator could wrap an aligned allocator but include
| a pointer sized prefix (a la array allocations) so you
| would be _guaranteed_ to never be more than pointer size
| aligned :D
|
| As you say versioning and namespacing is super
| problematic, but I'm not sure they'd even work here.
|
| At its core the problem is that some code is compiling
| with the knowledge it has aligned allocations, so can
| assume alignment, and some parts are not. There are a
| bunch of options that ensure that the allocator is
| consistent, but they devolve to either ignoring the
| new+delete overrides, or having the aligned allocators
| detect the override and forward to unaligned allocators
| while hoping nothing depended on correct alignment.
| viraptor wrote:
| > and then just repeatedly allocate in the hope that you
| will eventually get a correctly aligned value out
|
| If you preload something that patches all the new/delete
| interfaces, you can do this without guesswork.
|
|     new(size, alignment) ->
|       res = alloc(size+alignment)
|       res_aligned = res + ...
|       offsets[res_aligned] = res
|     new(size) ->
|       alloc(size)
|     free(ptr) ->
|       free(offsets[ptr] || ptr)
|       offsets.del(ptr)
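|
| A rough C++ rendering of that pseudocode, as a sketch only. The
| shim_* names and the raw_for_aligned map are hypothetical,
| std::malloc stands in for whatever allocator the preload would
| wrap, and it assumes every new/delete entry point is routed
| through these functions:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <unordered_map>

// Maps each aligned pointer handed out to the raw pointer the underlying
// allocator returned, so the raw pointer can be freed later.
static std::unordered_map<void*, void*> raw_for_aligned;

void* shim_aligned_alloc(std::size_t size, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);  // power of two
    // Over-allocate so an aligned address always fits inside the block.
    void* raw = std::malloc(size + alignment);
    if (raw == nullptr) return nullptr;
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(raw);
    const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    void* aligned = reinterpret_cast<void*>((addr + mask) & ~mask);
    raw_for_aligned[aligned] = raw;
    return aligned;
}

void* shim_alloc(std::size_t size) {
    return std::malloc(size);
}

void shim_free(void* ptr) {
    auto it = raw_for_aligned.find(ptr);
    if (it != raw_for_aligned.end()) {
        void* raw = it->second;
        raw_for_aligned.erase(it);
        std::free(raw);  // free the original block, not the adjusted pointer
    } else {
        std::free(ptr);  // plain allocation, nothing recorded
    }
}
```

| A real preload would also need locking around the map, and the
| map would have to avoid allocating through the very operators
| being patched. Both are left out here.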
| amluto wrote:
| See my comment above. tcmalloc implements the _C_ API as
| well, including aligned_alloc().
| jenadine wrote:
| That's not sound in general, but it is "probably" going to
| work for this specific case because the previous build was
| built with an allocator that did not support this alignment,
| meaning that they did not need extra alignment. This is
| pretty rare actually. And you already had to use a custom
| allocator with previous C++ versions to make it work anyway.
| olliej wrote:
| While I do agree with you, and think it's probably worth
| seeing if detecting the override and falling back to
| unaligned allocation works, the problem is not that the
| code in TF is compiling assuming/requiring over aligned
| data.
|
| The problem is that there is system code that they are
| calling that is making use of over aligned allocation, so
| therefore could be generating code dependent on said
| alignment. The failure mode can very easily be
|
|     someSystemLibrary.so`someFunction:
|       alignedThing = ::operator new(size, alignment)
|       ...
|       i_dunno_dma_memcpy_or_something(alignedThing, somewhere else)
|       ...
|       ::operator delete(alignedThing)
|
| With no interaction with TF code at all. _Except_ TF has
| replaced operator delete so that fails due to the allocator
| mismatch. If you make ::operator new(size_t, align_val_t)
| redirect to ::operator new(size_t) if it detects an
| override then the aligned operation can fail. The above
| example is moderately difficult to induce so it's more
| likely that there's an explicit split with the system
| doing one half of new/delete and TF doing the other, but
| the important thing is that it implies the system code is
| built aware of alignment and it depends on the alignment
| even if TF does not.
| Asooka wrote:
| The C interface for aligned memory allocation is
| aligned_alloc(). The returned pointers are always freed with
| free(). So what is probably happening is that aligned new
| calls aligned_alloc(), and then aligned delete simply calls
| the regular delete, expecting to end up in free(), which by
| design should work with both kinds of pointers.
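|
| A minimal sketch of that C-level contract only, using the
| standard aligned_alloc() and free(). The helper name is
| hypothetical, and this says nothing about what a particular
| standard library's aligned new actually calls:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Allocates with std::aligned_alloc and releases with plain std::free.
// Returns true if the allocation succeeded and was aligned as requested.
bool aligned_alloc_roundtrip(std::size_t alignment, std::size_t size) {
    // The original C11 wording requires size to be a multiple of alignment.
    assert(alignment != 0 && size % alignment == 0);
    void* p = std::aligned_alloc(alignment, size);
    if (p == nullptr) return false;
    const bool ok = reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
    std::free(p);  // same free() as for malloc'd pointers
    return ok;
}
```

| This only holds while aligned_alloc() and free() come from the
| same allocator, which is the assumption under discussion.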
|
| I _think_ the problem here is partly with the implementation
| of aligned new/delete. Since one is free to override only
| the old versions, the ones supplied by the standard library
| should make sure not to fall back to functions that may be
| partially overridden.
| amluto wrote:
| As pure speculation, one could forward to aligned_alloc and
| still free with ::delete. I haven't tested this, nor have I
| looked at the code.
| zokier wrote:
| LD_PRELOAD would probably run afoul of VAC though?
| Polycryptus wrote:
| Steam on Linux already uses LD_PRELOAD under-the-hood to load
| the overlay. Valve signs the overlay SO files, so they could
| be making an exception for Valve-signed-preloads in VAC, but
| it's also possible that VAC does something else to check for
| suspicious libraries loaded in.
| mmh0000 wrote:
| I loved the premise of the article, though I really wish the
| author had gone into detail about how he discovered the root
| cause.
| olliej wrote:
| This is a predictable outcome of overriding the global operator
| new. It remains annoying that this was ever allowed, and is a
| constant source of pain for c++ standard library implementations.
| phkahler wrote:
| It seems more like the app and driver are mixing their
| new/delete pairs. That seems like a bug to me. Maybe even an
| API design issue if it's supposed to happen.
| DannyBee wrote:
| It actually should still work, since fedora38 includes the
| llvm15 versioned libs.
|
| The only way to make this break is if something is loading
| random unversioned solibs or whatever the latest one it can
| find is, and expecting this to work forever.
|
| If it actually used a versioned solib, it would get llvm 15
| just like it did before.
|
| This is the whole point of versioned solibs.
| Karliss wrote:
| Whole graphics drivers using LLVM in the backend has caused
| countless issues. The way I look at it one of the main problems
| is that graphic API libraries shouldn't leak symbols from
| implementation details like them using LLVM. They should expose
| only the graphics API and nothing more.
| vchuravy wrote:
| Don't ask me about GNU_UNIQUE...
|
| Due to some wonderful C++ features the dynamic linker is forced
| to unify symbols across shared libraries, even if those symbols
| have different versions.
|
| This utterly breaks loading multiple libLLVM's except if you
| build the copy you care about with -no-gnu-unique (or whatever
| the flag was called)
|
| I have seen wonderful things like the initializers of an
| already loaded libLLVM being rerun when a new one is loaded.
| admax88qqq wrote:
| Unfortunately this is exactly the type of stuff that makes
| supporting commercial apps on linux a nightmare. Weird crashes
| due to weird linking of system libraries.
|
| Common distros are very adamant about dynamic linking everything
| in order to support the use case of "core library has
| vulnerability, upgrade it in place without rebuilding consuming
| apps." Along with a desire to avoid "dll hell" and force a single
| canonical version of every library systemwide. This leads to
| these sorts of issues.
|
| Windows gets around it by letting applications put the DLLs they
| care about beside the executable, and having it check there first
| by default.
| eikenberry wrote:
| Isn't this exactly the use case for which flatpaks are
| designed? Isn't Redhat/Fedora in the process of adopting them
| as the primary way to support third party/proprietary graphical
| apps like Steam? Doesn't the current Steam flatpak avoid this
| issue?
|
| TLDR; isn't this already addressed?
| doublepg23 wrote:
| The funny thing is that on Fedora in 2023 I don't feel like I'm
| missing out on most software.
| ho_schi wrote:
| Aehm. That is what a lot of closed-source applications do on
| Linux. And Valve does that, too.
|
| The open-source ones are maintained in the packaging system and
| kept lean.
| gabcoh wrote:
| Can Linux not trivially do the same thing as windows with
| LD_PRELOAD? If so why is this more of an issue on Linux than
| Windows? Is it really less a technical challenge and more just
| a matter of Linux getting less support from upstream
| developers?
| bravetraveler wrote:
| I was thinking/wondering this myself. Not to reinvent the
| wheel - more toss an idea around, but a _'venv for
| LD_PRELOAD'_ sounds like it'd deal with this pretty handily
|
| Not... in a way I'd use as a distribution/release maintainer.
| _Probably_ as an administrator [of my LAN]
| gabcoh wrote:
| Such things already exist. Eg. Appimage or even docker.
| lnxg33k1 wrote:
| and even that has managed to get split between snap,
| appimage and flatpak :D
|
| (sorry not meant to offend, long time linux day-to-day
| user here, but it was just ironic for me to point out
| fragmentation of fragmentation ^^)
| bravetraveler wrote:
| Right, but I don't really want to get into a distribution
| model - the hack suits me fine :)
|
| More an exercise in curiosity than anything
|
| Flatpak (or Snap, ew) probably deals with it fine today,
| Steam's there
| stabbles wrote:
| LD_PRELOAD is too global to be useful, it's hard to scope it
| to one process (and not child processes). macOS is better in
| the sense that it clears DYLD_* variables when the dynamic
| linker has done its work and the process starts. (Although
| that can also be painful when you want to run a shell script
| and set DYLD_* outside)
| nly wrote:
| You can compile binaries with additional relative library
| paths in to them that will take priority over /usr/lib64
| aidenn0 wrote:
| This sounds like it's an interaction with the GPU driver
| though, which could also happen on windows...
| stryan wrote:
| Valve does this for a couple of their games, see a similar issue
| with Dota 2[0].
|
| [0] https://github.com/ValveSoftware/Dota-2/issues/2285
___________________________________________________________________
(page generated 2023-04-24 23:00 UTC)