[HN Gopher] Setenv Is Not Thread Safe and C Doesn't Want to Fix It
___________________________________________________________________
Setenv Is Not Thread Safe and C Doesn't Want to Fix It
Author : r4um
Score : 188 points
Date : 2023-11-20 05:19 UTC (17 hours ago)
(HTM) web link (www.evanjones.ca)
(TXT) w3m dump (www.evanjones.ca)
| turtleyacht wrote:
| _Should_ apps orchestrate a super-global lock of a foreign
| namespace?
|
| An environment variable's value, for a running process, is just
| what it is: an initial value from outside.
|
| Adding complexity around it smells like an attempt to control a
| distributed mutex, like checking an API for real-time value
| changes in a while loop across several instances of the same app.
|
| I thought there would be alternatives to this, like pubsub,
| Kafka, or other asynchronous event handling.
|
| Imagine having to test an app for its ability to handle safe
| read-write of OS-level state. It's definitionally bankrupt: not
| really a unit, not easy to set up quickly, and not isolated.
| xxs wrote:
| >Imagine having to test an app for its ability to handle safe
| read-write of OS-level state.
|
| Also it should be able to handle invariants as modifying
| multiple variables is not an atomic process, either.
| aidenn0 wrote:
| > An environment variable's value, for a running process, is
| just what it is: an initial value from outside.
|
| Then calling setenv() from anywhere except in the time between
| a fork() and exec() call should be banned, but it's not.
| Honestly, calling abort() if setenv is called in the presence
| of threads would seem like a better status-quo than what we
| have today.
| Khelavaster wrote:
| Slap a mutex on that beast!
|
| This is expected behavior for setting a GLOBAL variable without a
| lock on the memory space.
| krackers wrote:
| You can't guarantee that whatever libraries you pull in use the
| same mutex as you though.
| JaDogg wrote:
| Who pulls in libraries in C like this?
| agevag wrote:
| I'm not sure I understand the question.
|
| If you are writing, say, a GUI application, and use GTK,
| GTK may use getenv (without your mutexes, of course) at the
| same time you call setenv in another thread, potentially
| crashing your entire application if you are unlucky.
| Tobu wrote:
| Who doesn't? libc itself calls getenv when getting system
| time: https://news.ycombinator.com/item?id=38344224
|
| You may have a mutex on getenv/setenv, like the Rust stdlib
| does, but when libc doesn't look at that mutex, even on the
| read side, you run into UB.
|
| So the next step is never calling into seemingly innocent
| libc functions in safe code (which you have to enforce on
| your dependencies as well), implementing safe alternatives
| to a good chunk of libc (and making sure your dependencies
| use those), to cordon off anything that _looks_ at the
| environment. This makes a good chunk of POSIX functionality
| useless.
| JaDogg wrote:
| OK, thank you for the explanation. This makes me more
| scared to bring in libraries :(
| seeknotfind wrote:
| I thought they were talking about inside the libc
| implementation. Though, if that was done and people call
| getenv from async contexts, it could deadlock.
| pjmlp wrote:
| WG14 also doesn't want to provide safer string and array
| manipulation libraries for decades, even Dennis Ritchie failed to
| get fat pointers into ISO C, why should fixing this be any
| different?
| raverbashing wrote:
| Yeah, at this point the languages and OSs should (IMHO) be
| distancing themselves from the craziness of the committee
| grandinj wrote:
| The right thing to do is to create a new thread safe api and
| implement that and then standardize it. Hard but very unsexy
| work.
| xxs wrote:
| The right thing is not to modify a global state once you have
| started all the threads. If you need to do that, use your own
| data structures. The case of getaddrinfo - would need a copy of
| the entire env. in a thread safe manner, then return the
| result. That would pretty much apply to anything that uses env
| saagarjha wrote:
| The person who fixed this would be incredibly sexy in my book.
| Though, that probably tells you more about my own sexiness than
| it invalidates your comment.
| xxs wrote:
| The sexy part is more like having an extremely
| brushed/retouched image in a magazine - it's not real.
| cryptonector wrote:
| It is fixed in Solaris/Illumos. glibc just needs to copy the
| approach.
| DangerousDoctor wrote:
| The essential problem is that there is no thread-safe way to
| implement this while maintaining backwards compatibility --
| applications can alter the environment block by changing the
| environ global pointer, applications can also alter the
| environment block by replacing individual pointers in the environ
| array, applications can also alter the environment block by
| altering the strings pointed to by the individual members of the
| environ array, applications can also alter the environment block
| by using setenv/putenv/etc.
|
| Inserting a mutex into the setenv/getenv/etc. functions is
| pointless because applications are explicitly allowed to modify
| the environ pointer and array directly without any locking.
| forrestthewoods wrote:
| Yup. It's a bad, broken, and unfixable API.
|
| The world really needs a new C standard library that doesn't
| suck.
| eviks wrote:
| Or transition to post-C
| xxs wrote:
| Java has System.getenv()... and it's an unmodifiable
| Map<String, String>. The real culprit is the attempt to
| modify it, not C.
| rewmie wrote:
| > Yup. It's a bad, broken, and unfixable API.
|
| Is it, though?
|
| The only argument I see is that an API can be misused. There
| are ergonomics debates we can have about it, but a user
| intentionally abusing an API wanting to do something that's
| completely wrong is hardly indication the API is broken.
| forrestthewoods wrote:
| Exposing global mutable access to something and providing
| no thread safe version counts as broken in my book. Your
| book may differ.
|
| A good API would probably only allow constant access. If
| mutation is for some reason deemed necessary then it should
| be through a separate set API and the return results of get
| should be guaranteed safe.
| rewmie wrote:
| > Exposing global mutable access to something and
| providing no thread safe version counts as broken in my
| book. Your book may differ.
|
| Please show me from which book you got the idea that env
| variables are expected to change throughout the lifetime
| of a process.
| necovek wrote:
| It seems they are arguing that setenv should not exist in
| the first place: the fact it exists suggests it can and
| should be used, and thus not be a footgun.
| orwin wrote:
| > the fact it exists suggests it can and should be used
|
| I think most people argue about that. Just because it
| exists doesn't mean it should be used imho.
|
| I've used it exactly once, and that was a school exercise
| where I had to write a Posix shell (most of a posix shell
| actually), including built-ins. I do not see another use
| case tbh.
| jstimpfle wrote:
| > can and should be used, and thus not be a footgun
|
| It can and should be used in the cases where it makes
| sense, with the restrictions that are documented. It's an
| API that is fundamentally not thread-safe, you can not
| use it "safely" (in the modern sense of using it after a
| lobotomy, in any way that the compiler allows) in a
| multi-threaded context.
|
| There are other such APIs, and if those APIs were removed
| it would hurt a lot of old software that is running
| perfectly fine.
| jeroenhd wrote:
| > Please show me from which book you got the idea that
| env variables are expected to change throughout the
| lifetime of a process.
|
| POSIX specifies two functions that alter environment
| variables. It could've specified that env variables are
| supposed to be mapped into a read-only memory page where
| available to indicate that they shouldn't be altered, but
| it didn't, and instead provided an explicit read/write
| system.
| rewmie wrote:
| > POSIX specifies two functions that alter environment
| variables.
|
| The same POSIX spec you're citing also states in no
| uncertain terms that setenv is not thread-safe.
|
| It's pointless to quote a section of a spec to try to
| justify failing to comply with the very same section of
| the very same spec.
| jeroenhd wrote:
| I'm not saying it doesn't. I'm just saying the spec
| indicates that environment variables can change during
| runtime.
|
| The spec is the problem in my opinion, you can't
| implement it in a way that doesn't introduce footguns.
| wredue wrote:
| Your book is wrong.
|
| As with most things programming related, "it depends".
|
| Functional programming has really, truly done untold and
| massive damage to the industry. Fortunately, you are free
| to unpoison your mind. I just hope people eventually do.
| jcelerier wrote:
| > Inserting a mutex into the setenv/getenv/etc. functions is
| pointless because applications are explicitly allowed to modify
| the environ pointer and array directly without any locking.
|
| By that logic mutexes themselves are pointless because nothing
| ever forces you to use them. Even in memory-safe languages you
| can still access /dev/mem and change bytes. It's still a useful
| thing to have.
| PhilipRoman wrote:
| The difference is that modifying the environ pointer is
| explicitly supported behaviour in the standard, poking
| through /dev/mem is not.
|
| Although I guess a middle ground solution wouldn't be too bad
| either - most programs don't modify environ directly, so
| POSIX could offer thread safety for the functions and make
| multithreading through "environ" UB. This is already kind of
| explained in the standard:
|
| https://pubs.opengroup.org/onlinepubs/9699919799.2018edition...
| jcelerier wrote:
| > The difference is that modifying the environ pointer is
| explicitly supported behaviour in the standard
|
| they just have to fix the standard. e.g. in my country they
| manage to improve for instance the standard for electrical
| plugs every three years, there is NO REASON posix cannot do
| the same
| atoav wrote:
| Another problem is that it is hard to reason about security in
| a C program if all environment variables could change at all
| times.
|
| If we are talking about the C application deciding when it
| wants to rescan the environment that is something different,
| but if your environment can change potentially before and after
| you check it this opens you up for a heap of new attacks.
| jeroenhd wrote:
| I think the memory leak solution (copy over the env variables
| to a new location in memory every time you call setenv and keep
| the old pointers alive) will cause the fewest crashes.
|
| I would personally go for the aggressive approach (release a
| new major version of libc that detects multithreaded
| environments and intentionally crashes out when calling
| setenv() so people actually notice and fix their broken
| programs) but I suspect not many people will agree with me on
| that.
|
| The API is not necessarily bad (it's just very 80s UNIX), but
| the lack of enforcement of thread-safety is causing all kinds of
| bugs and crashes.
| fch42 wrote:
| Leaking memory is not a "solution". Ever. Maybe for a
| commercial problem. But not for one in systems API design.
|
| If you copy, provide a new interface. It's time-honoured and
| proven in Unix to give *_r() ones in such a case.
| jeroenhd wrote:
| It solves hard crashes during DNS lookups by wasting a few
| kilobytes of RAM. Seems like a fine solution to me. The
| memory leak only occurs in circumstances where the program
| would've crashed or started messing with random memory
| anyway.
|
| A proper solution would be to either nuke put/setenv() in
| the C standard library or redesign the *env() calls
| entirely, but that would break existing programs.
| slaymaker1907 wrote:
| There are circumstances where it's a perfectly valid
| solution. For example, suppose you're trying to acquire the
| lock on something to destroy it. It can be the lesser of
| two evils just to leak that memory instead of just waiting
| forever/a very long time to acquire that lock. You just
| need to ensure that you aren't leaking memory too quickly
| for whatever your constraints are. For example, most
| programs wouldn't care about a 1kb/day leak of memory
| because that would take a very long time to actually become
| noticeable. Furthermore, there's pretty much always some
| degree of memory growth just due to heap fragmentation (at
| least if you're using a language like C which can't do
| memory compaction via GC).
| lelanthran wrote:
| Yeah, but the program may not be broken until you, the glibc
| maintainer, call `raise(SIGSEGV)`.
|
| Most programs using setenv call it before starting any
| threads. That is not broken.
|
| Detecting the linkage of thread support and crashing that
| program on purpose is, frankly, a pathological way to fix a
| non-broken program.
|
| Besides which, your proposal won't work anyway, because this
| remains a potential problem in single threaded programs
| anyway: a program calling getenv, storing the result, and
| then calling setenv on the same variable and using the
| previously stored result will break anyway.
|
| In summary, your proposal is broken in two different ways: 1)
| it breaks well-defined programs, and 2) it fails to break
| broken programs.
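The single-threaded hazard described above (keeping a getenv() result across a later setenv() on the same variable) can be avoided by copying the value first. A minimal sketch in C; `env_snapshot` is a hypothetical helper name, not a libc function:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap-allocated copy of an environment
 * value, or NULL if the variable is unset or allocation fails.
 * The caller owns the copy and must free() it. */
char *env_snapshot(const char *name)
{
    assert(name != NULL);

    /* getenv() returns a pointer into storage owned by libc; a later
     * setenv()/putenv()/unsetenv() may invalidate or overwrite it. */
    const char *value = getenv(name);
    if (value == NULL)
        return NULL;

    size_t len = strlen(value) + 1;   /* include the terminating NUL */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, value, len);
    return copy;
}
```

The copy stays valid regardless of what happens to the environment afterwards. This only addresses the stale-pointer problem in a single thread; it does nothing about another thread calling setenv() while getenv() is running.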
| jeroenhd wrote:
| I wouldn't implement it during linkage, obviously single
| threaded putenv/setenv calls should still be permitted as
| part of initialisation routines. Count the number of
| children in /proc/self/task for all I care, the detection
| needs to happen during runtime.
|
| You're right that putenv/setenv are also horribly broken in
| other ways, and doing multi thread detection doesn't
| prevent those problems. In a perfect world we would just
| kill off these two functions all together, replacing them
| with either crashes or no-ops, but that'd be an even harder
| sell.
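For illustration, the runtime check mentioned above (counting entries in /proc/self/task) could look roughly like this on Linux. `count_threads` is a hypothetical name, and the result is only a snapshot: another thread can start right after the count is taken, so this is a sketch of the idea, not a reliable guard.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stddef.h>

/* Linux-only sketch: each thread of the calling process appears as a
 * numeric directory under /proc/self/task.
 * Returns the number of threads, or -1 if /proc is unavailable. */
int count_threads(void)
{
    DIR *dir = opendir("/proc/self/task");
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        /* Skip "." and ".."; thread IDs are purely numeric names. */
        if (isdigit((unsigned char)entry->d_name[0]))
            count++;
    }
    closedir(dir);

    /* The calling thread itself is always listed. */
    assert(count >= 1);
    return count;
}
```

A libc that wanted to refuse setenv() in multi-threaded processes would compare this count against 1; the objection in the replies (false positives for programs that never race) applies to any such check.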
| lelanthran wrote:
| > I wouldn't implement it during linkage, obviously
| single threaded putenv/setenv calls should still be
| permitted as part of initialisation routines. Count the
| number of children in /proc/self/task for all I care, the
| detection needs to happen during runtime.
|
| That still breaks well-defined, non-broken programs which
| _don't_ call getenv/setenv in racing ways. There is no
| way for you to do a conditional-upon-threads mechanism
| without false positives.
|
| > You're right that putenv/setenv are also horribly
| broken in other ways, and doing multi thread detection
| doesn't prevent those problems. In a perfect world we
| would just kill off these two functions all together,
| replacing them with either crashes or no-ops, but that'd
| be an even harder sell.
|
| But you don't need to in order to meet your original goal
| - breaking programs which _do_ call setenv/getenv in the
| wrong order. Proposing to remove them altogether doesn't
| fulfill the goal of finding the breakages immediately and
| introduces breakages in existing programs.
|
| My alternative: use LD_PRELOAD and provide alternative
| setenv/getenv functions which raise SIGSEGV when setenv
| is called on a variable more than once, and when getenv
| is called on a variable that was already set once. It
| requires nothing more than a counter for each of
| setenv/getenv per variable.
|
| That finds programs which actually are broken, with no
| false positives, and ignores threads altogether because
| they don't matter under the counter system[1].
|
| Best of all, you can implement this in an afternoon,
| without needing to modify glibc, and then test it with
| every single executable on your system to see which ones
| break.[2]
|
| [1] Since the caller knows they are not thread-safe
| anyway, we aren't looking for the error where the caller
| calls setenv concurrently in different threads. That's a
| different problem.
|
| [2] I would wager good money that few, if any systems,
| will break under this test.
| fch42 wrote:
| you're right; in addition to that though, I'd like to highlight
| that the use of some form of locking "inside" set/getenv would
| gain you nothing at all. That is not because of setenv, but
| because of getenv. The latter returns you a _pointer_. Whether
| you call that a reference leak, an ownership breakage ... it's
| not "yours" and when you have it, you don't "hold" it even if
| getenv internally were to lock whatever the underlying data
| structure might be.
|
| _That_ is the issue. You can only solve that if you change the
| interface. Make a new one, getenv_r(), have it _copy_ the env
| var value into a user-provided, user-owned buffer. In that
| case, you can then assure the returned value is both point-in-
| time correct and immutable. You can never achieve that with
| getenv() because if you copy/make the returned pointer owned,
| the owner needs to free it, which is a break from the current
| behaviour and so not backwards-compatible ... and hence out of
| the question.
|
| Lamenting about how broken the interfaces might be and then
| insisting that the implementation should be fixed is ...
| "conveniently shortsighted". Not saying this isn't worth
| fixing, but fix it the right way in the right place.
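A copying interface of the kind proposed above might be sketched as follows. `app_getenv_r` and `app_setenv` are hypothetical application-level wrappers, not existing libc functions, and the lock only protects callers that go through these wrappers; libc internals and third-party libraries calling getenv() directly are not covered, which is exactly the limitation raised elsewhere in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* One process-wide lock shared by the two wrappers below. */
static pthread_mutex_t env_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical getenv_r-style wrapper: copy the value of `name` into the
 * caller-owned buffer. Returns 0 on success, ENOENT if the variable is
 * unset, ERANGE if the buffer is too small. */
int app_getenv_r(const char *name, char *buf, size_t buflen)
{
    assert(name != NULL && buf != NULL && buflen > 0);

    int rc = 0;
    pthread_mutex_lock(&env_lock);
    const char *value = getenv(name);
    if (value == NULL) {
        rc = ENOENT;
    } else {
        size_t len = strlen(value);
        if (len >= buflen)
            rc = ERANGE;
        else
            memcpy(buf, value, len + 1);  /* copy while the lock is held */
    }
    pthread_mutex_unlock(&env_lock);
    return rc;
}

/* Hypothetical writer-side wrapper taking the same lock. */
int app_setenv(const char *name, const char *value)
{
    assert(name != NULL && value != NULL);

    pthread_mutex_lock(&env_lock);
    int rc = setenv(name, value, 1);
    pthread_mutex_unlock(&env_lock);
    return rc;
}
```

Because the caller never holds a pointer into the environment, the value it sees is a point-in-time copy that later writes cannot change.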
| SAI_Peregrinus wrote:
| Make getenv copy to an OS-provided buffer. Free at program
| exit, like any other memory leak. There's an obvious
| drawback, but it's not changing the function signature.
| cryptonector wrote:
| > you're right;
|
| Wrong. As I've pointed out several times in this thread and
| in other recent threads about getenv(), Solaris/Illumos has
| an implementation that is lock-less to read (except when you
| change `_environ`, then it takes a lock at most once until
| the next time you change `_environ`). It's made safe by
| "leaking", and by locking in the functions that write. It's
| only unsafe if you replace the value of `_environ` repeatedly
| _and_ free the old settings (which I've never seen any code
| do, and which if you do then you get what you deserve).
| cryptonector wrote:
| All of this is false, including the last statement. Proof by
| existence: Solaris/Illumos has a thread-safe (leaking, though
| the leaks are hidden from memory debuggers), lock-less
| (whenever you _don't_ write to `environ`) `getenv()`, and
| thread-safe, locking `setenv()`/`putenv()`/`unsetenv()`:
| https://src.illumos.org/source/xref/illumos-gate/usr/src/lib...
|
| Yes, it can be done and has been done. glibc has no excuse.
| krackers wrote:
| Cool, ever since that rachelbythebay article [1] I was wondering
| how different libcs handle the issue! Nice to see someone
| else confirming the behavior of Apple's libc. It's not mentioned
| in the article, but while Apple's libc seems to suffer from the
| use-after-free issue, if I'm reading it right it does seem to
| have locking for setenv/getenv [3]
|
| [1] https://news.ycombinator.com/item?id=37908655
| [2] https://news.ycombinator.com/item?id=37952916
| [3] https://github.com/apple-open-source-mirror/Libc/blob/master...
| xxs wrote:
| The described problem (thread safety) for a global configuration
| seems mostly a misunderstanding by the author.
|
| The usual case for modifying a global state is: modify once, then
| proceed (e.g. start new threads). Even if all the calls become
| thread safe, the behavior would be inconsistent, still.
| hddqsb wrote:
| It is perfectly reasonable and consistent for one thread to set
| an environment variable while other threads are reading
| _different_ environment variables.
| rpcope1 wrote:
| I wonder if you could work around this by using LD_PRELOAD to
| load in a shim around getenv and setenv. You'd still have the
| problem of environ potentially getting mutated, but it very well
| may solve the problem if it's limited to those two functions.
| jeroenhd wrote:
| You would still need to design a fix. You'll probably
| break programs that modify the pointer returned by getenv()
| while doing so.
|
| However, this only makes sense for other people's software
| crashing on getenv() related memory bugs. If you control the
| software, you can simply prevent the setenv() call yourself. No
| need to LD_PRELOAD anything, just load a library or write your
| own hooking code to work around the POSIX madness.
| eqvinox wrote:
| > This is a list of some uses of environment variables from
| fairly widely used libraries and services. This shows that
| environment variables are pretty widely used.
|
| Widely used, yes. Used as in read. Why do any of these need to
| change at runtime? And if they do - why are they environment
| variables?
|
| (NB: starting a new process is not "at runtime")
| withinboredom wrote:
| Changing the env during runtime is actually quite handy for
| debugging and forcing the program into specific states.
|
| Other than that, it can also be handy in k8s with a VPA. You
| get more/less memory and then update the env to reflect that.
| Your service picks up the env change and updates the runtime.
|
| IIRC, there is/was some way to listen to those changes in C#,
| and automatically update runtime settings.
| eqvinox wrote:
| > Other than that, it can also be handy in k8s with a VPA.
| You get more/less memory and then update the env to reflect
| that. Your service picks up the env change and updates the
| runtime.
|
| You... can't change the env from outside the process...
|
| are you saying this is used by disjoint components within a
| single process? Or is this just a misunderstanding?
| withinboredom wrote:
| You can spawn as many processes as you want in a container,
| did you not know that?
|
| But you only need access to the /proc/pid directory to
| change another process's env.
| eqvinox wrote:
| > But you only need access to the /proc/pid directory to
| change another process's env.
|
| /proc/$pid/environ is not writable
|
| (and as a matter of fact, due to how the environment
| works, it cannot be writable.)
| LegionMammal978 wrote:
| But /proc/pid/mem is, if you like living dangerously!
| You'd just have to parse the dynamic-linker metadata to
| find where libc's environ is hiding. (Though statically-
| linked programs would be tougher.)
| SAI_Peregrinus wrote:
| Spawning a new process doesn't _require_ changing the
| parent's environment.
| pjc50 wrote:
| > You... can't change the env from outside the process...
|
| Not with that attitude you can't.
|
| (OK, without the joke: you can do this with an interactive
| debugger. But I think OP just meant "change it in the
| container and then restart the child process")
| rewmie wrote:
| > Changing the env during runtime is actually quite handy for
| debugging and forcing the program into specific states.
|
| Most debuggers nowadays support altering variables at runtime
| after hitting breakpoints. In the meantime this was the very
| first time I ever heard anyone even considering changing env
| vars at runtime, let alone use it to debug stuff. Sounds like
| an ass-backwards way of going about debugging.
| riffraff wrote:
| > Changing the env during runtime is actually quite handy for
| debugging and forcing the program into specific states.
|
| Wait, why would this need to happen at runtime? I have used
| env vars a lot to trigger specific cases but why would you
| want to do this while the process is running from within the
| process itself?
|
| If you control the process you can start it with the right
| env to begin with, no?
| oefrha wrote:
| Have you ever exported anything in a shell script? Sure you can
| keep the necessary changes in local state and pass those to
| execve(2)/execvpe(3)/posix_spawn(3), and that would be safe
| AFAIK, but setenv(3) is there and more convenient if you're
| unaware of the hidden dangers. Also that doesn't work for PATH
| in execvp/execvpe, which is read from the current process; how
| do you change search paths for execvp without setenv (short of
| doing the search yourself)?
|
| Edit: I just realized macOS/FreeBSD have execvP() that allows
| passing a custom search path, so PATH is now safe, but without
| a -e variant, everything else is again unsafe.
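Keeping the child's environment in local state and passing it explicitly, as described above, might look like this. `run_with_env` is a hypothetical helper. It uses posix_spawn() rather than posix_spawnp(), so no PATH lookup against the parent's environment is involved, and the parent's own environment is never modified.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: run the program at argv[0] (a full path) with
 * exactly the environment in child_env, without touching the parent's
 * environment. Returns the child's exit status, or -1 on failure. */
int run_with_env(char *const argv[], char *const child_env[])
{
    assert(argv != NULL && argv[0] != NULL && child_env != NULL);

    pid_t pid;
    /* The last argument is the child's complete environment. */
    if (posix_spawn(&pid, argv[0], NULL, NULL, argv, child_env) != 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller builds a NULL-terminated array such as `{"GREETING=hello", NULL}` and passes it as `child_env`; since setenv() is never called, there is nothing for other threads to race against.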
| quickthrower2 wrote:
| Shell scripts are different as you are likely exporting
| environment variables and then starting new processes.
| oefrha wrote:
| Shell scripts aren't different from "real" programs using
| exec or posix_spawn in this regard, it's just that fewer
| people have done the latter than the former, so the former
| is a more relatable example. "Real" programs spawn other
| processes too you know, sometimes with modified environ.
| quickthrower2 wrote:
| So I understand this right, I thought the issue is about
| multiple threads but in shell you wouldn't have this just
| new processes.
|
| In a program you could have either.
| rcxdude wrote:
| Shell scripts are not really prone to this problem because
| AFAIK no shells are multithreaded: subshells and the like are
| implemented with fork()
| oefrha wrote:
| Yes, I'm not saying shell scripts are affected, merely
| using them as an example to answer the question "Why do any
| of these [env vars] need to change at runtime?"
| xxs wrote:
| The discussion is only relevant for a shared unguarded
| resource (the env) modified and read by multiple threads.
| Single threaded operations are just fine.
| Someone wrote:
| > Single threaded operations are just fine.
|
| Sort of. https://pubs.opengroup.org/onlinepubs/9699919799/functions/g...:
|
| _"The returned string pointer might be invalidated or
| the string content might be overwritten by a subsequent
| call to getenv()"_
|
| There's little you can do with a broken API, so Linux has
| that 'feature', too.
| https://man7.org/linux/man-pages/man3/getenv.3.html:
|
| _"The string pointed to by the return value of getenv()
| may be statically allocated, and can be modified by a
| subsequent call to getenv(), putenv(3), setenv(3), or
| unsetenv(3)."_
|
| FreeBSD chooses to leak memory, instead.
| https://man.freebsd.org/cgi/man.cgi?getenv(3):
|
| _"Successive calls to setenv() that assign a larger-
| sized value than any previous value to the same name will
| result in a memory leak. The FreeBSD semantics for this
| function (namely, that the contents of value are copied
| and that old values remain accessible indefinitely) make
| this bug unavoidable"_
| xxs wrote:
| >Have you ever exported anything in a shell script
|
| So, shells use a single thread that can safely modify the
| environment - then start new child processes by the same
| thread. The child processes get a =copy= of the said
| environment. That's a textbook example how to use env.
|
| Starting multiple threads on your own, then modifying env
| should be considered a textbook example how not to do things
| - env is not intended for interprocess communication.
| anttihaapala wrote:
| In the case of execvp, you would pretty much be required to
| _fork_ before it and _then_ you could change PATH.
| oefrha wrote:
| Yeah, fork()+immediately exec() should be safe, but those
| use cases are almost always better with posix_spawn(), due
| to issues with fork(), like memory copying. And if you want
| to use the p-variant of posix_spawn you're back to setting
| PATH beforehand. These APIs designed back in Stone Age just
| aren't very well thought-out wrt concurrency and high
| performance.
| jstimpfle wrote:
| Why would you change the path just to call
| posix_spawnp()? If you want that control, that is an
| indication that you want to specify the path to the
| executable, not use PATH.
| eqvinox wrote:
| Shells don't generally use the libc environment; this would
| be too limited to implement even standard POSIX shell
| functions with local variables, or non-exported variables.
| It's much easier to set up purpose-built data structures to
| track variables, and construct an argument for execve().
|
| (Edit: removed unneeded pointing out execve)
|
| Also shells generally have their own program search anyway
| since they need to support built-in commands. It's not
| particularly hard to implement PATH search.
| oefrha wrote:
| Once again, the OP asked why setenv is even needed, which
| implies they likely don't have much experience with
| spawning processes in low level languages, so I used the
| more familiar shell script setting as an illustrative
| example, as setenv is analogous to export in POSIX sh. I
| never said export is implemented with setenv, or shell
| script exports aren't thread safe. Unfortunately, replies
| hung up on shell scripts.
|
| As for my supposedly not being aware of execve etc... You need
| to re-read my comment which clearly mentions execve, execvpe,
| posix_spawn, as well as implementing PATH search on your
| own.
| eqvinox wrote:
| > Once again, the OP asked why setenv is even needed,
| which implies they likely don't have much experience with
| spawning processes in low level languages
|
| I am the OP and your assumption is incorrect. You may
| consider why the post ends with:
| (NB: starting a new process is not "at runtime")
| Izkata wrote:
| "export" in shells has to change the environ before they
| start the new process. It may not be "at runtime" for the
| new process, but it would be for the shell.
| account42 wrote:
| Wrong, export does not _have_ to change the shell's
| environment at all. There are plenty of exec variants
| that accept a different environment pointer, same for
| posix_spawn.
| dmytroi wrote:
| Mostly integration, for example some library can only be
| configured via env variables, but a developer might want to
| configure it from within the app it's integrated into and used
| from.
|
| Also, a few weeks ago I found a use for them when trying to pass
| configuration from Java/Kotlin to C++ library to be used during
| static constructors (invoked during dlopen) on Android, because
| at that phase native code cannot call back to JVM.
| guappa wrote:
| > for example some library can only be configured via env
| variables
|
| library has already loaded when you call setenv, so what
| you're saying doesn't work in most cases.
|
| It seems to be a need to use poorly written libraries. You
| might consider fixing them instead.
| jonhohle wrote:
| I agree that would be a poor implementation, but the
| library could be loaded at runtime using dlopen or
| equivalent.
|
| The issue with that "interface" is the environment is
| process global. If the library is being loaded dynamically
| (specifically for some task) it would seem that the
| parameters are local to that task and should be taken by
| some reentrant init method. Alternatively, the process
| could be forked and environment set in the child without
| concern for thread safety or polluting the environment
| (think of the children!).
| guappa wrote:
| The only library I've seen to use env vars is libc, which
| uses them to decide how malloc should behave for example.
| the_svd_doctor wrote:
| Some libraries' behaviour/API can be tweaked with env vars.
| Env vars are read at runtime, not at load time.
| wzdd wrote:
| Indeed -- it's an extremely unconvincing list, because any
| sensible library which may require a library user to set env
| variables (which includes all the ones I checked on the list)
| can also be configured without setting env variables. Most of
| the time the env variables set fallback defaults for parameters
| not specified by the caller. In these cases, the sane thing to
| do, regardless of the thread-safety of setenv(), is simply to
| supply the parameter in code.
|
| The only exception is things like debug logging, which is
| unlikely even to work dynamically.
|
| On the other hand, setenv() is clearly broken in modern code,
| particularly in a library context, and the man page (at least
| on my Linux machine) does not make that particularly obvious --
| "Thread safety: MT-Unsafe" is the only note, with a reference
| to attributes(7) for more information. It could definitely be
| made more obvious.
| qwertox wrote:
| Just asking: If you pass security tokens via environment
| variables to the process, doesn't it make sense to delete them
| from within the process after they have been used?
| eqvinox wrote:
| Yes it _would_ make sense, but no there is no way to actually
| ensure they have been deleted. A trivial but nonetheless very
| common case would be if your process is started with a
| wrapper shell script. But even just within your process,
| there is no guarantee at all against some random library (or
| the kernel) making a copy of the entire environment.
|
| If you want to pass secrets into a process at startup, I
| would strongly recommend passing a pipe as an additional open
| file descriptor (e.g. fd #4, but this _FD number_ you can
| then put in an env variable) and writing the secret into the pipe. It
| can only be read once, and you can control where the value
| propagates.
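
A minimal sketch of the receiving side of this approach, in C. It assumes the parent has already created a pipe, written the secret into it, and exported the read end's descriptor number in an environment variable. The variable name `SECRET_FD` and the helper `read_secret_from_fd_var` are hypothetical, chosen only for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: reads a secret from the inherited pipe whose
 * descriptor number is stored in the environment variable `fd_var`.
 * On success the secret is NUL-terminated in `buf` and its length is
 * returned; on failure -1 is returned. The descriptor is closed once a
 * valid number has been parsed, so the secret cannot be read again. */
ssize_t read_secret_from_fd_var(const char *fd_var, char *buf, size_t cap)
{
    if (buf == NULL || cap == 0)
        return -1;

    const char *s = getenv(fd_var);
    if (s == NULL || *s == '\0')
        return -1;

    /* Parse the descriptor number strictly: digits only, in int range. */
    char *end;
    errno = 0;
    long fd = strtol(s, &end, 10);
    if (errno != 0 || *end != '\0' || fd < 0 || fd > INT_MAX)
        return -1;

    size_t used = 0;
    while (used < cap - 1) {
        ssize_t n = read((int)fd, buf + used, cap - 1 - used);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            close((int)fd);
            return -1;
        }
        if (n == 0)
            break;                  /* writer closed its end of the pipe */
        used += (size_t)n;
    }
    buf[used] = '\0';
    close((int)fd);
    return (ssize_t)used;
}
```

Only the descriptor number travels through the environment here, so a copy of the environment made by a wrapper script or a library does not contain the secret itself.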
| Zandikar wrote:
| Damn, learning new tricks everyday, thanks for the tip.
| leoh wrote:
| Testing, for one thing...
|
| I mean YES you can factor your code (tests, whatever) to make
| this a non-issue but supposing some person wrote some code 10
| years ago in an OSS project or on your team and you start
| banging into this issue.
|
| It's not going to be trivial to unwind let alone find the root
| issue.
|
| Let's start fixing things like this for our future selves,
| right?
|
| Digging heels in and saying "eh, you just got to learn this one
| weird quirk.. oh yeah this other one too.." is kind of a fun
| glass bead game until it's not; it is also not a way to win
| hearts and minds.
| pitdicker wrote:
| This also caused a lot of trouble for time libraries in Rust. The
| two foundational libraries, chrono and time, rely on localtime_r
| to get the local time instead of the clock value in UTC.
| localtime_r reads the TZ environment variable (and optionally
| others like TZ_DIR). Rust declares it safe to modify the
| environment, while POSIX declares it unsafe.
|
| CVE-2020-26235, RUSTSEC-2020-0071 and RUSTSEC-2020-0159 were
| opened against the crates. That left the Rust ecosystem with a
| pretty much unsolvable issue for many months. Chrono went with
| the solution to parse the timezone database of the OS natively
| and read the environment using the Rust locks. Time tries to
| detect if the libc version has thread-safety guarantees to access
| the environment, and otherwise panics if there are multiple
| threads.
|
| More reading: https://docs.rs/chrono/latest/chrono/#security-advisories
| rewmie wrote:
| > Rust declares it safe to modify the environment, while POSIX
| declares it unsafe.
|
| There's your problem right there, and it ain't the behavior
| specified in the standard.
| pitdicker wrote:
| You are right. POSIX specifies one thing, the standard
| library in Rust and some other libraries specify something
| different. 'Safe to use unless there are other threads' is
| not really something you can or want to encode in a type
| system.
|
| But libraries and users are caught in the middle.
| eptcyka wrote:
| It is safe to use the Rust standard library interface.
| pitdicker wrote:
| Unless the environment is also touched by a part of the
| program written in Go, Julia, I don't know... The lock is
| not shared across languages.
| the_mitsuhiko wrote:
| > The lock is not shared across languages.
|
| Which just to be clear: it cannot without changing the
| standard. There is really nothing anyone can do without a
| change in the standard.
| bbatha wrote:
| However, to access any of those languages from rust you
| need to use unsafe.
| eptcyka wrote:
| There is no safe way to access the environment, even if
| you mark this API unsafe, what are you going to do?
| bbatha wrote:
| You can safely access the environment so long as you use
| the rust apis and don't have unsafe code that calls
| `setenv` without synchronization.
| Sytten wrote:
| There is an issue in the std to name setenv unsafe but that
| is a breaking change so it's complicated.
| kibwen wrote:
| One problem is that marking that function as unsafe would
| unfairly penalize platforms like Windows that don't have
| this issue. Even if it turns out to be the least-bad
| compromise solution, it sure would be nice if we could have
| nice things.
| kibwen wrote:
| But Rust doesn't declare it safe to modify the environment in
| general. It declares it safe to modify the environment using
| std::env::set_var, which uses locking internally. The docs
| explicitly note that there's potential unsafety if non-Rust
| code modifies the environment:
|
| _" Note that while concurrent access to environment
| variables is safe in Rust, some platforms only expose
| inherently unsafe non-threadsafe APIs for inspecting the
| environment. As a result, extra care needs to be taken when
| auditing calls to unsafe external FFI functions to ensure
| that any external environment accesses are properly
| synchronized with accesses in Rust."_
|
| https://doc.rust-lang.org/std/env/fn.set_var.html
|
| Ultimately the problem here is with Posix. Rust can only do
| so much to paper over the pitfalls in the underlying
| platform.
|
| Although note that if you replace libc with eyra, then the
| behavior goes from thread-unsafe to "just" a memory leak:
| https://blog.sunfishcode.online/eyra-does-the-impossible/
| SkiFire13 wrote:
| > Rust declares it safe to modify the environment, while POSIX
| declares it unsafe.
|
| Arguably, Rust declares it is safe to modify the environment
| through its stdlib methods. The tricky detail is that this
| means it is unsafe to read/modify the environment through other
| means, but sometimes this is really hard to avoid.
| asveikau wrote:
| > The tricky detail is that this means it is unsafe to
| read/modify the environment through other means, but
| sometimes this is really hard to avoid.
|
| If you have C and Rust in the same process and C code calls
| setenv(3), for one ...
|
| Edit: why downvotes? It's very typical to link to C libraries
| which may call the libc environment stuff ... My point is you
| can't control library code as easily, if it's some dependency
| of a dependency eventually calling libc.
| manwe150 wrote:
| Does rust also add a pthread_atfork handler? Otherwise, it
| seems likely still unsafe for rust to claim to support
| calling fork (for execv) or posix_spawn, as most libcs call
| realloc on the `environ` contents, but do not appear to take
| any care to ensure that (v)fork/posix_spawn doesn't happen
| concurrently with that. Worse yet, the `posix_spawnp` API
| takes an `envp` parameter and expects you to pass it the
| global pointer `environ`, which is completely unsynchronized
| across that fork call. It is not obvious to me that this is a
| security gap, but certainly it seems to me that this would
| violate rust's safety claim, if it is not taking added
| precautions there.
|
| The Apple Libc appears to just unconditionally drop the
| environ lock in the child
| (https://github.com/apple-oss-distributions/Libc/blob/c5a3293...),
| while glibc doesn't appear to even bother with that
| (https://github.com/bminor/glibc/blob/6ae7b5f43d4b13f24606d71...)
| connicpu wrote:
| I don't think Rust's stdlib provides any kind of safe way
| to call just fork(), it only has methods for creating child
| processes because that's the only interface that works on
| every supported Tier 1 platform. Calling fork is always
| going to necessarily be an unsafe{} libc call or syscall,
| and the caller will have to take care to ensure nothing
| funny is going on.
| namibj wrote:
| There are OS specific APIs where needed, probably also
| for threads.
| connicpu wrote:
| `std::os::unix` does add some additional methods in that
| vein like exec(), but no fork(). `std::os::linux` only
| adds the ability to get `pidfd`s for child processes you
| create. There's simply no safe way for the stdlib to
| provide safe fork() without knowing a lot of things about
| how you're going to set up your process and what other
| libraries you might pull in that may not be fork-safe. If
| you're willing to ensure you only call it in a safe way,
| you can still call fork, the language just cannot
| guarantee it will be safe, same as when you're doing it
| in C.
| tsukikage wrote:
| If you are modifying TZ while another thread is relying on it
| to calculate time, those threads are racing, and hiding the
| crash won't solve the race: the reading thread will now
| randomly return values in the wrong timezone instead,
| subsequent code will use it in whatever operation it is it
| wanted the time for, the end result will be garbage, and this
| will be super hard to debug because there won't be a loud
| obvious crash pointing to the root cause and also depending on
| the winner of the race the symptoms will be
| random/intermittent.
|
| Fix the high level race, and suddenly you no longer need the
| low level mutex.
| formerly_proven wrote:
| > If you are modifying TZ while another thread is relying on
| it to calculate time
|
| environ is a single contiguous null-terminated segment of
| null-terminated key-value pairs; any change of any
| environment variable might reallocate it, changing the
| address and invalidating the old address.
|
| Also why it's a bad idea to store the pointer returned by
| getenv, it might be invalidated by any environment
| modification.
| Sprocklem wrote:
| The strings in environ are only contiguous at program start.
| In every libc I'm aware of, both putenv and setenv replace
| only the specified key-value pair (and possibly environ
| itself, if it needs to be larger) and should not affect the
| address of any other environment variables. It is still
| thread-unsafe, but far more limited in its unsafety.
| comex wrote:
| In current glibc master, it's unsafe for any
| putenv/setenv to race with any getenv, even if the
| variable names are different, for two reasons. (Note that
| multiple calls to putenv/setenv are serialized by a lock,
| but getenv does not take the lock.)
|
| (1) setenv resizes environ using realloc, which frees the
| old buffer, so getenv can end up reading from a freed
| array.
|
| (2) The code does not use atomics or memory barriers, so
| on weakly ordered architectures, getenv could observe
| another thread's write to one of the pointers in the
| environ array, or to the environ pointer itself, while
| observing stale values for the memory behind it.
|
| In both cases, getenv could end up returning a bogus
| pointer or just crashing.
|
| However, those issues _can_ be fixed without changing the
| API, and at least Apple's libc seems to do the right
| thing here. On the other hand, other libcs such as musl,
| FreeBSD libc, and even OpenBSD libc (!) do worse than
| glibc and have no locking at all.
|
| If someone could convince the maintainers of all those
| libcs to add a lock and make getenv/setenv 'thread safe
| as long as you're not racing on the same variable name',
| then that would be a good starting point. But in my
| opinion it would still be a half-measure. We need a fully
| thread-safe environment.
|
| And honestly, it might be _easier_ to convince the
| maintainers to add a full solution than a half-measure,
| even if it involved API changes. (But it may be hard
| either way. Rich Felker showed up in a Rust thread a
| while back and was highly negative on the idea of making
| any changes to musl.)
| mjevans wrote:
| IMHO - I am sympathetic to the BSDs, Apple (presumably
| forked BSD), and musl approach.
|
| In what sane world would someone reasonable treat
| (initial shell) Environment Variables as a proper ACID
| compliant database? About the 'best' solution I can see
| for preventing segmentation faults related to resizing
| the env array during runtime is to defer reclaiming freed
| memory chunks until after all in-process threads have
| been given another uninterrupted timeslice to process.
| Even that wouldn't be 100% but probably would cover any
| not pathological case.
| bensecure wrote:
| atomic ordering is very easy if you don't care about
| performance. So on the other hand we could ask why
| get/put/setenv have such a terrible need for performance
| that we can't afford to put a simple lock around them.
| asveikau wrote:
| > the reading thread will now randomly return values in the
| wrong timezone instead, subsequent code will use it in
| whatever operation it is it wanted the time for, the end
| result will be garbage,
|
| I really strongly disagree with how bad you seem to think
| this is. If you are designing your application to use the
| timezone and modify it at the same time, it is a _totally
| natural_ consequence that you may see the previously set time
| zone in a timing-dependent fashion. That's the nature of the
| beast. To "solve this" is seemingly to make that other thread
| capable of time travel or something. It read something before
| it was written, and acted on it. Reasonable!
|
| The harmful data races are when you read _intermediate_
| results. If setting the timezone is a multi-step process, or
| involves manipulation on complex data structures with
| pointers that might be deallocated, then you are in grave
| danger. Seeing a previously valid result is ... I honestly
| don't know how you'd expect to solve it without threads
| being able to see the future, or some other unreasonable
| expectation.
| lifthrasiir wrote:
| To be exact, it was Chrono and time-rs 0.1, while time-rs 0.2
| and later was rewritten from scratch and didn't have that
| issue... because the new time-rs didn't yet support general
| time zones other than fixed offsets. The accepted solution for
| Chrono surprised me a lot, because as far as I reckon it was
| the hardest solution. (Disclaimer: I'm the original author of
| Chrono.)
|
| But a bad API design doesn't end at environment variables. Many
| POSIX systems rely on `/etc/localtime` to define the system-
| wide time zone, and every `localtime` call has to check if the
| file has been changed or not because there is no way to
| subscribe to the system-wide time zone change event. Of course
| there is a cache, but many libcs call at least `stat` per each
| `localtime` call AFAIK. I had even experienced a possible glibc
| bug due to the lack of guard against I/O error during this
| process [1]. Windows got this right, I can't see why POSIX
| couldn't do the same when it does have an asynchronous signal
| delivery mechanism anyway.
|
| [1] https://news.ycombinator.com/item?id=9953898
| pitdicker wrote:
| My respects for your work on Chrono!
|
| And you are right about time-rs (or I think you are). Version
| 0.1 was never fixed, and version 0.3 does the OS and thread
| count checks.
|
| It does have some advantage for chrono to do everything in
| Rust: it can now return two results for ambiguous local time
| during DST transition fold, and properly return None during a
| transition gap.
| lifthrasiir wrote:
| > My respects for your work on Chrono!
|
| Thank you. To be frank as a first-time maintainer I did a
| mediocre job---my biggest regret for Chrono is that I did
| know most forthcoming issues beforehand and yet didn't take
| enough time to make them public and explicit so that
| someone else could prepare for the future.
| wavesquid wrote:
| I believe systemd has a way to subscribe to timezone changes.
| account42 wrote:
| > Many POSIX systems rely on `/etc/localtime` to define the
| system-wide time zone, and every `localtime` call has to
| check if the file has been changed or not because there is no
| way to subscribe to the system-wide time zone change event.
|
| But you _can_ subscribe to file change events so why not do
| that?
| lifthrasiir wrote:
| I did seriously consider inotify back in time, but in order
| to take advantage of inotify I had to parse _all_ binary
| TZif files (because otherwise I still had to call
| `localtime` that would `stat` every time anyway). It was so
| cumbersome, that was only halfway finished when I stepped
| down as a maintainer. Hence my surprise when I learned that
| someone actually did implement all of them.
| mprovost wrote:
| Unix was "designed" (if you can call it that) a long time
| before it was possible to move a running system between
| timezones. So many of these decisions were made in completely
| different circumstances (I almost said environment) and are
| lying around like old WWII bombs just waiting for someone to
| dig one up.
| mcguire wrote:
| Presumably, GNU Hurd will fix these issues without
| introducing fun new ones.
| mjevans wrote:
| Offhand, and after a quick Google search, I was unable to find the
| exact definition / specification for how time-zone data must
| be obtained rather than how it happens to conventionally be
| obtained.
|
| It is entirely reasonable that any of the following _might_
| be valid behavior.
|
| * Simple but syscall heavy approach which re-reads the env,
| and possibly /etc/localtime each call and has no stability.
| (Results may mutate as other processes / threads change
| things.)
|
| * Same as above, then caches the decision result for some
| application specific reasonable time; which may be until the
| application exits.
|
| * The elsewhere mentioned stat / inotify approaches that only
| track updates to /etc/localtime (and ideally update the
| cached decision result when notified).
|
| All approaches seem valid. It's sort of like the hostname or
| any other system level configuration where a reboot may be a
| reasonable expectation for a complete update.
| Sytten wrote:
| The shame of those CVEs is that they created a split in the rust
| community between chrono and time. For a time it looked like
| people were all moving to time (whose handling of TZ is a bit
| stupid IMO since it just refuses to work if there is more than
| one thread). But with chrono 0.4 now things are stale and there
| is no clear winner anymore.
|
| I would argue that those splits are in great part responsible
| for the feeling that rust is hard to learn. I remember having
| to dig into pretty complex time code to understand why it
| broke our program that relied on timezone when we switched from
| chrono to time. It hinders your productivity for sure even if
| you learn the how.
| vintermann wrote:
| Using environment variables for global mutable state isn't
| exactly good practice, is it?
|
| I can't think of any time I had wanted to do that.
|
| What exactly are the programs that break if this changes?
| okr wrote:
| I can think of using libraries, that get their config through
| environment variables. So you start your program, modify
| environment, and then start up the rest.
| quickthrower2 wrote:
| Does the "and then" protect you here?
| okr wrote:
| No, not really. You never know when the library reads the
| environment variables. But I can make sure that the parts
| under my control do not modify them after start.
| johanbcn wrote:
| Libraries shouldn't be reading environment variables, that
| responsibility should be only for the application.
| kuon wrote:
| It is common; off the top of my head, SDL and pipewire do it.
| lelanthran wrote:
| They shouldn't, but if even the C standard library reads
| environment variables, then the bar is set pretty low for
| library designers.
|
| Look up the mess around LOCALE sometime.
|
| The best the programmer can do is perform all the setenv
| calls before spawning any threads or making any library
| calls.
| Someone wrote:
| I may misunderstand what "Extension to the ISO C standard" can
| mean, but _getenv_ isn't thread-safe, either.
|
| https://pubs.opengroup.org/onlinepubs/9699919799/:
|
| _"The getenv() function need not be thread-safe"_
|
| I expect most if not all implementations are more robust.
| numeromancer wrote:
| It says more than that:
|
| "The returned string pointer might be invalidated or the string
| content might be overwritten by a subsequent call to getenv()"
|
| You don't even need threads for this to be unsafe; another call
| to the same function may invalidate earlier-gotten pointers. I
| don't see how to interpret this as anything but broken.
| yason wrote:
| Come on! Of all C/Posix thread-safety issues in circulation this
| can plausibly be considered the most moot one.
|
| Environment variables are not meant to be an inter-thread
| communication channel and the documentation that points out
| setenv() is not thread-safe is very much a fair shot.
|
| You rarely, if ever, need to setenv() anything maybe unless
| you're a shell. For spawning children execve() already takes an
| envp parameter. For debugging I think I've mostly set in-process
| environment variables manually from gdb.
|
| Further, because environment variables are an interface between
| the process and its environment you typically read environment
| variables at start and cache the parsed values in some internal
| location. If you need to change that global state on the go you
| should do it using your own internal variables instead of
| recycling it through the environment and having the program
| threads repeatedly getenv() the updated values.
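
A minimal sketch of the read-once-and-cache pattern described above, in C. The struct `app_config`, the function `app_config_load`, the variable names `APP_LOG_LEVEL` and `APP_WORKERS`, and the fallback values are all hypothetical. The assumption is that the loader is called once at startup, before any threads exist.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical application settings, parsed once from the environment. */
struct app_config {
    char log_level[16];
    long workers;
};

/* Call once at startup, before any threads are created. Afterwards the
 * program reads `cfg` and never needs to call getenv() again. */
void app_config_load(struct app_config *cfg)
{
    /* Copy the string right away: the pointer returned by getenv() is
     * not guaranteed to stay valid across later environment calls. */
    const char *level = getenv("APP_LOG_LEVEL");
    if (level == NULL || *level == '\0')
        level = "info";                     /* assumed default */
    strncpy(cfg->log_level, level, sizeof cfg->log_level - 1);
    cfg->log_level[sizeof cfg->log_level - 1] = '\0';

    cfg->workers = 4;                       /* assumed default */
    const char *workers = getenv("APP_WORKERS");
    if (workers != NULL && *workers != '\0') {
        char *end;
        long n = strtol(workers, &end, 10);
        if (*end == '\0' && n > 0)
            cfg->workers = n;               /* accept only a positive integer */
    }
}
```

Later changes to the process environment do not affect the cached values, which is the point: runtime state lives in the program's own variables rather than being recycled through the environment.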
| usrbinbash wrote:
| > C Doesn't Want to Fix It
|
| Or: C knows that _it doesn't need fixing._
|
| How often do I need to `setenv()` anything? The answer is "Never"
| in the vast majority of programs, because envvars are usually read
| rather than set, so this issue is nonexistent for them.
|
| For the vast majority of the small number of programs that
| actually need to use `setenv()`, the answer is: "Maybe once or
| twice during the entire lifetime of the process, and then only at
| the very start, probably even before running any threads",
| meaning this issue is nonexistent for them as well.
|
| So, is there a potential issue with thread safety? Yes. Does it
| matter given where and under what circumstances it occurs? Not
| really.
|
| > such as Go's os.Setenv (Go issue)
|
| Here is the link to the "issue":
|
| https://github.com/golang/go/issues/63567
|
| What kind of actual real life production code would continuously
| set envvars while simultaneously calling a function that tries to
| read the environment?
|
| Yes, this is a footgun. But even the issue's author acknowledges,
| in the issue thread:
|
| > Realistically: this is a pretty rare problem, and documenting it
| is probably a fine solution. This is probably going to cost
| someone else a couple of days of debugging every couple of years
|
| > It has wasted thousands of hours of people's time, either
| debugging the problems, or debating what to do about it.
|
| Source?
| MrBuddyCasino wrote:
| Unpopular opinion: Neither Go's _os.Setenv_ nor Rust's
| _std::env::set_var()_ should exist. I was pleased to find that
| Java only has _System.getenv()_, but not a setter.
| usrbinbash wrote:
| That is an unpopular opinion for the simple reason that some
| programs do in fact need to set envvars, particularly
| programs that will start child processes.
| MrBuddyCasino wrote:
| That is still possible using the _java.lang.ProcessBuilder_
| API: you can launch a child process and give it a modified
| environment, but just at launch time. This side-steps the
| issue.
| ryan-c wrote:
| Programs that need to set environment variables for child
| processes should use `execvpe` or `execle`.
| account42 wrote:
| Or posix_spawn / posix_spawnp.
| badsectoracula wrote:
| Or programs that rely on libraries that for some
| unfathomable reason expose some functionality only via
| environment variables without an API.
|
| _looks at SDL_
| account42 wrote:
| There is SDL_SetHint [0] which doesn't modify the
| environment but instead changes the value internally to
| SDL only.
|
| [0] https://wiki.libsdl.org/SDL2/SDL_SetHint
| pajko wrote:
| Nope, execve() and friends ending in 'e' accept a pointer
| to a completely new set of environment variables, no need
| to do setenv. Windows has _execve() too.
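
A minimal sketch of the approach mentioned above, in C: the child receives a complete environment through `execve()`, and the parent's own environment is never modified. The helper name `run_with_env` is hypothetical, and the sketch assumes a POSIX system.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `path` with the given argv and with exactly
 * the environment in `envp` (a NULL-terminated array of "NAME=value"
 * strings). Returns the child's exit status, or -1 on failure. */
int run_with_env(const char *path, char *const argv[], char *const envp[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: only async-signal-safe calls between fork and exec. */
        execve(path, argv, envp);
        _exit(127);                 /* reached only if execve() failed */
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;              /* retry only when interrupted */
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing `{"DEMO_GREETING=hello", NULL}` as `envp` gives the child that single variable, while a concurrent `getenv()` in another thread of the parent sees no change.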
| usrbinbash wrote:
| The fact that there is an alternative doesn't change the
| fact that a lot of software relies on the worse method to
| work.
| naniwaduni wrote:
| It makes it a pretty silly idea to invite people writing
| new programs in a new language to use that method,
| though.
| rerdavies wrote:
| A lot of software doesn't modify the environment when
| exec-ing.
| kevincox wrote:
| I think the problem is really that there are 2 times where
| you should be setting env vars 99% of the time.
|
| 1. Right after program startup before any threads are
| spawned.
|
| 2. After a fork before an exec.
|
| In both cases it can be known that no threads are running.
| (Ok, for 1 it can actually be non-trivial if you have code
| before main or if you call functions that spawn helper
| threads, but let's assume that you can know this).
|
| However no languages actually have ways to enforce this. So
| the APIs can be called at any time and are huge footguns.
|
| I think that the proposed improvement of `getenv_s` is
| great. It is cheap and easy to use, then software can
| slowly migrate off of the less safe stuff. You can imagine
| that if libc stopped using `getenv` internally most of this
| problem would be solved.
| Blikkentrekker wrote:
| No, in many cases one needs to set them interactively.
|
| Consider for instance something as simple as implementing
| a shell. Such a program needs to be able to set the
| environment based on user interaction and this change
| needs to show up in /proc/$pid/environ.
| __david__ wrote:
| Why does a shell need its current environment to be
| visible in /proc/$pid/environ (as opposed to just its initial
| environment)?
| rerdavies wrote:
| If you need to set environment variables for child
| processes in a thread-safe manner, use execvpe or execle.
| jeroenhd wrote:
| I think there are good reasons for Setenv and set_var to
| exist, but if they are implemented, they shouldn't be
| wrappers around POSIX's shitty API and should implement their own
| environment variable system instead (one whose initial
| variables are possibly initialised by a call to getenv to
| make them compatible).
|
| There's no reason why these languages need to restrict
| themselves the same way C does.
| MrBuddyCasino wrote:
| The bug in Golang was because DNS lookups interact with the
| C library, which looks up environment variables. As long as
| everything happens in Go-land, there is no problem - but
| this is simply not good enough.
| jeroenhd wrote:
| Go makes the assumption that the DNS lookups are thread-
| safe, but it doesn't have that guarantee (or the C
| library is spec-incompliant, but I doubt that). It's
| still something Go can fix.
|
| You can't fix C libraries loaded into Go programs (i.e.
| an external library calling C's setenv, or I suppose
| explicit FFI calls by the user), but Go can be
| responsible for the APIs it calls itself. That may
| necessitate writing a thread-safe alternative for DNS
| lookups, or documenting and/or adding compile time
| warnings that threaded programs doing DNS lookups will
| just crash sometimes, but the language's standard library
| can still make it much harder for developers to write
| buggy code.
| MrBuddyCasino wrote:
| My impression is that this was Golang's plan from the
| start - this is why they didn't want to use the C stdlib
| at all, issuing the Kernel syscalls directly from the
| Golang runtime. A good idea, but then they had to
| backpedal to solve issues such as DNS resolution
| respecting certain OS settings, and this bug is a symptom
| of that.
| fch42 wrote:
| Yes, there are certain things in UNIX which _are_ part of
| the standard (POSIX / IEEE1003) but _aren't_ usually
| implemented as system calls.
|
| Name lookups (whether user identities or network
| resources) are the biggest chunk of these. You have a
| "choice" as a user/programmer here. Say, the existing
| name lookup interfaces in most libc implementations don't
| do DNS-over-HTTP (DoH); you can implement that yourself
| and just use the addresses returned by your
| library/package where the system calls ... want
| addresses.
|
| If you have the go stance, go all the way. Don't say "the
| C runtime is sh*te but I really really really want that
| one particular teensy tiny bit of it could someone
| somewhere somehow please do something to make it a little
| less sh*te". Legacy baggage is a burden and backwards
| compatibility shackles you. The C/Unix interfaces are
| full of this, and with the hindsight of 50 years no one
| today, not even "C programmers", would implement them all
| the same way again. But that doesn't mean their behaviour
| can be arbitrarily changed.
| o11c wrote:
| > Go makes the assumption that the DNS lookups are
| thread-safe
|
| DNS functions _are_ thread-safe.
|
| The thing people aren't understanding here is when you
| set loose nasal demons (such as by calling `setenv` in a
| multithreaded program), they can cause problems even in
| safe code.
| dwattttt wrote:
| If a function is safe only if everyone else you rely on
| never calls a particular function, it's not that safe.
| Certainly less safe than other functions guaranteed not
| to result in crashes if you use them right.
| OskarS wrote:
| That doesn't fix the problem: these languages have to be
| able to coexist peacefully with C in the same address
| space. You can have a dynamically linked library written in
| Rust in a host program written in C, you can use C
| libraries in Go, etc.
|
| Even if that wasn't an issue: this is a bug in C as well!
| You should absolutely be able to use setenv/getenv safely
| in multi-threaded C, it's insanity that you can't.
| grodriguez100 wrote:
| I fully agree with your unpopular opinion.
| jeroenhd wrote:
| > Or: C knows that it doesn't need fixing.
|
| People don't like APIs that can randomly crash your program
| while there's no good technical reason for why they should. Why
| not fix the problem? People like you, who have no issues with
| the current implementation, won't see any regressions because
| you're already a good citizen, and myriad other programmers
| whose programs do occasionally crash because of this will be
| helped.
|
| > So, is there a potential issue with thread safety? Yes. Does
| it matter given where and under what circumstances it occurs?
| Not really.
|
| "The unpredictable crashes only happen very rarely" doesn't
| mean the crashes go away.
|
| > What kind of actual real life production code would
| continuously set envvars while simultaneously calling a function
| that tries to read the environment?
|
| The reproduction sample calls setenv in a loop so the issue can
| be reproduced. A single setenv anywhere in the code is enough
| to trigger the crash, but then you would get one of those "you
| need to run the program a million times to reproduce it" bug
| reports that gets pushed down the line.
| usrbinbash wrote:
| > Why not fix the problem?
|
| Because doing so breaks backwards compatibility, simple as
| that.
|
| The problem isn't even that `setenv` isn't thread-safe. The
| problem is that `getenv` returns a `char *` directly into the
| environment memory space. Many many many programs rely on
| that being the case.
|
| > People like you
|
| People like me would like every software to be perfect, but
| that's not the world we live in, so we are forced to be
| pragmatic. When fixing something causes more problems by
| breaking backwards compatibility promises, than it prevents,
| then there is no good argument for a fix, and the correct
| approach is to say "yes, this sucks, let's document it well
| so people don't waste too much time on this".
|
| The setenv/getenv problem is such a case. Anyone who
| disagrees is free to fork glibc, implement whatever fix they
| think is adequate, and then try to compile the software
| packages found on a typical Linux server against the result.
|
| > so the issue can be reproduced.
|
| "Can be reproduced" and "is a common issue in production
| code" are not the same.
|
| Fact is, almost all production programs that set envvars, do
| so once, very early in the process lifecycle, and then never
| again, and so are never affected by this.
| mastax wrote:
| So why not implement the fix suggested in the article:
| improve the existing interface to the extent possible, and
| introduce a new interface which is easier to use correctly.
| rerdavies wrote:
| So why not implement it yourself, instead of polluting
| the standard runtime with functionality that nobody
| needs?
| fch42 wrote:
| There is nothing to "improve" on the existing interface,
| really. From a C point of view ... a _hidden_ global lock
| is worse than no lock at all. Because in the latter case
| ... you, as the programmer, have a choice what to do. If
| you never call setenv(), no locks. If you only ever call
| setenv() in your startup code, no locks. If you only ever
| call setenv() after fork&co, no locks. And if you do
| believe you need to call it at runtime, but are
| singlethreaded ... still no locks. And if you really
| really really need to call it from a multithreaded
| process, concurrently with getenv(), then lock around
| both and make your getenv() "safe" wrapper create you an
| owned point-in-time copy - basically a getenv_r().
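|
| A minimal sketch of the wrapper pair described above, assuming
| POSIX threads. The names `env_mutex`, `getenv_copy` and
| `setenv_locked` are hypothetical, and the lock only protects
| callers that go through these wrappers: getenv() calls made
| directly by libc or by third-party code stay unprotected.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical application-wide lock. It only helps if every
 * environment access in the program goes through the wrappers below. */
static pthread_mutex_t env_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Returns a malloc'd point-in-time copy of the value, or NULL if the
 * variable is unset or memory is exhausted. The caller must free() it. */
char *getenv_copy(const char *name)
{
    pthread_mutex_lock(&env_mutex);
    const char *value = getenv(name);
    char *copy = value ? strdup(value) : NULL;
    pthread_mutex_unlock(&env_mutex);
    return copy;
}

/* Sets (and overwrites) a variable while holding the same lock. */
int setenv_locked(const char *name, const char *value)
{
    pthread_mutex_lock(&env_mutex);
    int rc = setenv(name, value, 1);
    pthread_mutex_unlock(&env_mutex);
    return rc;
}
```

| The copy stays valid after later setenv_locked() calls, which is
| the difference from the raw pointer that getenv() hands out.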
|
| Note also that "global references" like getenv() returns
| and point-in-time owned snapshots don't behave the same
| way. Say, a library initializer code could retrieve a
| number of env var references by calling getenv(), and
| then use those at runtime. No more need/use for getenv()
| again after - and even perf-sensitive code could look at
| the env var. With a func that copies, the perf-sensitive
| code would need to do that each time (lock, lookup,
| copy). Not strongly desirable.
|
| Also ... UNIX is rather flexible ... and if you so wish,
| you _can_ substitute _your own_ setenv()/getenv() by the
| magic of dynamic linking. To create a set that locks and
| returns you leaked copies (changes the semantics of
| getenv so that the caller must free the pointer to avoid
| a leak). It's all possible to do this.
|
| I'm getting the impression from this that we see a "go
| tantrum" here. "I make my own standards but I wanna use
| that C/Unix standard thing as well but not how it is
| because it's not nice it should take go into account
| waaaahwaaah ...".
|
| It is not _nice_ to modify your own env at runtime.
| Maybe, just maybe ... that's for reasons. Because not
| everything that can be done is also a great idea.
| wredue wrote:
| The real skinny of it is that it's in the name:
| "Environment".
|
| If you're calling setenv in the middle of your program, you
| fucked up.
|
| There are those things in programming that should be
| extremely triggering to your "what the actual fuck?!" senses,
| and "setenv in the middle of runtime" is one of those things.
| kstrauser wrote:
| True, but for every envvar a program reads, _something_
| called setenv on it originally. It's not like _no_ programs
| call setenv in the middle of runtime. Examples:
|
| - Shells
|
| - CI runner
|
| - Container launchers
|
| - IDEs
| tsukikage wrote:
| The child process's environment for these purposes is
| constructed without mutating its parent's environment - a
| copy is used - and before the child process actually runs
| the target code it was created to run. So there is no
| possibility of race between mutations to the environment
| and reads of the environment. If you are writing such a
| tool but doing something other than this, you are doing
| it wrong.
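|
| As a sketch of that pattern, assuming a POSIX system: the parent
| builds a separate `envp` array and hands it to execve(), so its
| own environment is never modified. `run_with_env` is a
| hypothetical helper name, not a libc function.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs `path` with the given argv and a caller-built environment.
 * Returns the child's exit status, or -1 on failure. The parent's
 * own environment is left untouched: no setenv() is involved. */
int run_with_env(const char *path, char *const argv[], char *const envp[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve(path, argv, envp);
        _exit(127); /* only reached if execve() failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

| The child sees only what is in `envp`; nothing the parent later
| reads with getenv() has changed.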
| DSMan195276 wrote:
| > but for every envvar a program reads, something called
| setenv on it originally
|
| That's not true, that's just misunderstanding how it
| works. `execve()` takes an entirely new copy of
| environment variables to give to the child, that's the
| "real" way to do it.
| stefan_ wrote:
| No, a process gets its environment variables from the
| operating system (just like argc, argv) before any code
| is ever executed and the majority never change them.
| ric2b wrote:
| Then why does setenv even exist? Maybe that's the issue and
| it should be deprecated and throw compilation warnings?
| zare_st wrote:
| > People don't like APIs that can randomly crash your program
| while there's no good technical reason for why they should.
| Why not fix the problem?
|
| I think you're not seeing this from the right POV. People
| that consume POSIX API need to know POSIX API.
|
| https://pubs.opengroup.org/onlinepubs/009604499/functions/se.
| ..
|
| It says loud and clear "The setenv() function need not be
| reentrant. A function that is not required to be reentrant is
| not required to be thread-safe."
|
| > "The unpredictable crashes only happen very rarely" doesn't
| mean the crashes go away.
|
| If you get a crash over setenv() reading the manual page of
| setenv C call should be your first step. And the only step.
| The bigger issue is in design of application that has wrongly
| assumed setenv() is thread-safe. That requires a refactoring
| and is solely due to developer misunderstanding the API.
| loup-vaillant wrote:
| You're blaming the victim here.
|
| Not being re-entrant makes the user-facing API
| unnecessarily complicated. It creates an avoidable foot
| gun. It trips people up for no good reason. And unlike
| stuff like signed integer overflow, there doesn't even seem
| to be a (dubious) performance argument to justify this
| insanity.
|
| The standard should be fixed and that's the end of it.
| tikhonj wrote:
| "RTFM" is not a coherent defense for awful API design and
| we shouldn't accept it as such.
| zare_st wrote:
| Who is "we"?
|
| I'm a UNIX/C programmer for decades and I don't care
| about this.
|
| There is no such thing as beautiful API design. Every
| design is a compromise. If you think non-reentrant calls
| should be deprecated in POSIX take it to the committee.
|
| There is a myriad of non-reentrant code both in POSIX
| spec and in libc implementations. You need to RTFM, I'm
| sorry.
|
| There is no "coherent API" as far as null termination
| goes too. Some library functions deal with it, some calls
| don't. You need to RTFM.
|
| I also want to know OP's reason to even use setenv() in a
| multithreaded piece of software. It's like an oxymoron.
| setenv and vars are useful to pass on data from parent
| process to forked children because they inherit the
| environment. If you use the threading model you don't
| need it. If your application is a single process setenv()
| is useless.
| usrbinbash wrote:
| What should we accept? That every library is made under
| the assumption that it has to work as expected, even if
| people ignore the documentation?
|
| As someone who made and maintains multiple libraries: No.
| Not gonna happen.
| JohnFen wrote:
| Putting aside whether or not the design is awful, the
| fact that it's standardized and documented is absolutely
| a valid argument. Changing it now would break backward
| compatibility. That should always be a showstopper.
|
| Programmers who are using any library code without
| reading and understanding the documentation are asking
| for trouble regardless of language.
|
| The correct solution to your objections is to create new
| functions that behave as you prefer.
| xbar wrote:
| While I am sure that thousands of hours have been spent
| debugging threaded setenv() attempts (and developing &
| discarding Annex K), it is clearly not a problem that needs a
| solution.
|
| Languages that compile to C need to be careful not to promise
| thread-safe implementations of POSIX or C functions that are
| explicitly documented as not reliably thread-safe, including
| setenv(). The author seems to want to change C, and POSIX, so
| that Go can reliably do so.
| loup-vaillant wrote:
| You are literally putting forth arguments in favour of fixing
| the thread safety issue, and then conclude it's not worth the
| effort.
|
| It's simple, really: we indeed rarely call `setenv()`. So it's
| not a performance problem. So we can make it thread safe, and
| the performance impact will be negligible. In exchange for this
| small price, safety will increase.
|
| Sacrificing any amount of safety for a negligible improvement
| in performance is flat out unprofessional, and should be
| grounds for immediate termination in most contexts.
| DSMan195276 wrote:
| How do you propose making it thread-safe? The real problem
| here is that `getenv()` was designed around it returning a
| `char *` into some read-only memory. It's a bad API if the
| backing data can change because the returned pointer is
| assumed to exist 'forever'.
|
| `setenv()` has no way of knowing where those pointers are
| floating around so there's no way to safely change the
| environment variables. The best you could do would be to leak
| memory every time you set new environment variables so that
| the old pointers don't get invalidated, and that just creates
| a new problem and reason not to use `setenv()` (that's
| arguably worse).
| cnity wrote:
| Here's my proposal: Introduce a new threadsafe API
| (`tgetenv` or whatever) which takes _two_ `char *`s, one of
| which is a return buffer. This leaves allocation as a
| responsibility of the caller.
|
| And then you can leave the existing functions as they are
| (thread unsafe) while having a separate thread safe
| version.
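|
| A rough sketch of what such a function could look like, assuming
| POSIX threads. `tgetenv` is the hypothetical name from the
| proposal, not an existing libc function. The extra length
| parameter, the `tgetenv_mutex` name and the return convention
| (length of the value, or -1 if unset) are assumptions of this
| sketch, and it is only safe if every setter takes the same lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical lock; a matching setter would have to take it too. */
static pthread_mutex_t tgetenv_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Copies the value of `name` into the caller's buffer, truncating if
 * needed and always NUL-terminating when buflen > 0.
 * Returns the full length of the value (excluding the NUL), so a
 * result >= buflen means the copy was truncated; returns -1 if the
 * variable is not set. */
long tgetenv(const char *name, char *buf, size_t buflen)
{
    long result = -1;
    pthread_mutex_lock(&tgetenv_mutex);
    const char *value = getenv(name);
    if (value != NULL) {
        size_t len = strlen(value);
        if (buf != NULL && buflen > 0) {
            size_t n = len < buflen - 1 ? len : buflen - 1;
            memcpy(buf, value, n);
            buf[n] = '\0';
        }
        result = (long)len;
    }
    pthread_mutex_unlock(&tgetenv_mutex);
    return result;
}
```

| Returning the full length lets the caller detect truncation and
| retry with a bigger buffer, much like snprintf().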
| DSMan195276 wrote:
| I agree that would be the way to do it, but now we're no
| longer talking about simply 'fixing' the implementation
| of the existing API but rather introducing a new function
| you have to use.
|
| `setenv()` would only be safe if your program never uses
| `getenv()`, and calls to `getenv()` are so numerous and
| all over the place that for most non-trivial programs it
| would be hard to ensure they never happen.
|
| There's also the rub that `setenv()` is not part of the C
| standard, it's POSIX. I don't think the C standard would
| ever introduce `tgetenv()` to fix a problem it doesn't
| have, so non-POSIX code would have to continue to call
| `getenv()` since that's all that is available to them.
| josefx wrote:
| > I don't think the C standard would ever introduce
| `tgetenv()` to fix a problem it doesn't have
|
| The C standard has no problem acknowledging that getenv
| is subject to data races for most of its implementations.
| As far as I can tell that part was even added at the same
| time as threading support.
| DSMan195276 wrote:
| Well actually I'll have to eat my words on that one - I
| didn't catch that Annex K in C11 includes `getenv_s`
| (even if it is optional).
| leoh wrote:
| >you have to use
|
| I mean, why not just deprecate the old one; add a warning
| if it's used
| DSMan195276 wrote:
| That doesn't really help you determine whether a given
| library is using `getenv()` or not. That also requires
| that things are actually recompiled/updated, which for
| some C libraries is not that often.
|
| There's also the rub that many C programs do not target
| the latest standard (for a variety of reasons). I didn't
| realize `getenv_s` was added in C11 (though it's
| optional), but it doesn't really matter because
| programs/libraries that target C89 or C99 can't use it
| anyway.
| salawat wrote:
| GNU convention is <funcname>_r for a reentrant version of
| a non reentrant function.
|
| I'm in the process of working on a tool in C at the
| moment, so for once I actually have some context on
| what's being grumped about here!
| JohnFen wrote:
| Entirely this. It works so well that I've seen this in
| various utility libraries for decades.
| forrestthewoods wrote:
| > to safely change the environment variables. The best you
| could do would be to leak memory every time you set new
| environment variables so that the old pointers don't get
| invalidated, and that just creates a new problem and reason
| not to use `setenv()` (that's arguably worse).
|
| Arguably worse? My goodness no.
|
| This is a rare edge case that most programs don't
| encounter. Option 1 is to crash and explode and die. Option
| 2 is to leak tens of bytes.
|
| Leaking tens of bytes is for sure NOT worse than crashing.
| marcosdumay wrote:
| > Leaking tens of bytes is for sure NOT worse than
| crashing.
|
| I do really disagree here. The answer is not clear at
| all.
|
| But then, you are mischaracterizing the problem. The
| issue is not with crashing, you can get plain bad data
| too, and this is clearly worse than both leaking memory
| and crashing.
|
| Also, the GP is mischaracterizing the options. You don't
| need to leave the old values around, you can just copy
| them into userspace memory.
| DSMan195276 wrote:
| My reasoning is simple - the issues here can be avoided
| if you're careful about how you use `setenv()` and
| `getenv()`, which many programs already are. The memory
| leak in contrast would never be avoidable regardless of
| how you use it.
| forrestthewoods wrote:
| The problem with "be careful" is that libraries often
| want to use the very unsafe API and there is no standard
| mechanism to expose safety. It's fundamentally a bad API
| design. It could be good. But it is not.
|
| There's a reason this problem comes up on HN once or twice
| a year. And don't even get me started about printf
| grabbing a mutex for a stupid locale...
| none_to_remain wrote:
| You could even put a note in the man page that `setenv()`
| will leak memory. Then ten or twenty years from now there
| will be a blog post about how a currently trendy
| language/runtime can be manipulated into looping over
| `setenv()` zillions of times and OOM'ing, and comments
| about how no one can possibly be expected to read the man
| page for this horrible footgun, and it's wrong to expect
| developers to have any idea about what they're doing,
| give a shit, or pay attention at all.
| leoh wrote:
| This is not a good argument imo. Its "rarity" still affects a
| tremendous number of folks in profoundly vexing ways that are
| difficult to debug on account of this not only affecting C
| but innumerable other languages' compilers and interpreters
| that rely on the stock getenv implementation.
|
| I wouldn't be surprised if a good chunk of compilers and
| interpreters in other languages suffer from this gotcha.
|
| I mean, I wouldn't even be surprised if some _JVM_
| implementations silently expose their users to bugs on
| account of this implementation.
|
| EDIT: ... ha, sure looks like it https://github.com/openjdk/j
| dk/blob/a2c0fa6f9ccefd3d1b088c51...
| usrbinbash wrote:
| > You are literally putting forth arguments in favour of
| fixing the thread safety issue, and then conclude it's not
| worth the effort.
|
| Yes. I do. These two concepts don't contradict each other.
|
| > No it's not a performance problem. So we can make it thread
| safe, and the performance impact will be negligible
|
| Who said anything about performance being the problem, or a
| reason not to change it!?
|
| The problem is _BACKWARDS COMPATIBILITY_. The issue is that
| `getenv` returns a `char *` into the envvar array. Basically
| every application that uses this function relies on this
| fact.
|
| So we have:
|
| a) A potential issue that occurs only in very unusual
| circumstances, most of which will never occur in production
| code and on the odd chance that they do, they can easily be
| avoided. Documenting that well can help prevent time wasted
| in debugging.
|
| b) A fix that may prevent a) but breaks backwards
| compatibility promises, and would necessitate reworking god
| knows how many programs, the vast majority of which were
| never impacted by the issue in the first place.
|
| Of these 2 options, a) is just the better one. Yes, in an
| ideal world, we could have pure, 100% bug free code, and spend
| an unlimited amount of time on fixing every last problem.
| That's not the world we live in however, and so a pragmatic
| approach is simply a necessity.
| zlg_codes wrote:
| Ooh, the old 'unprofessional' epithet! What do you mean by
| that slur here? Most can't agree on what professional even
| means. Additionally, why should one be held to artificial,
| inconsistent, and poorly defined standards of
| 'professionalism' when they aren't a professional?
|
| My care for code robustness scales with income.
| Galanwe wrote:
| I don't quite get why this is a problem per se.
|
| I mean, the environment is just a chunk of memory made available
| to the process by the OS. It is no more, no less thread safe than
| any other chunk of memory.
|
| Why would the libc need to protect it more than any other memory
| location?
| fooker wrote:
| Changing any kind of global state is fundamentally not thread
| safe.
|
| Sure you could use locks, stop the world, etc, but there is no
| way you can ensure that all the data and information you had
| derived from the old state is going to be valid.
|
| A better solution is to not rely on global state like this.
| larschdk wrote:
| What is the use case for a mutable environment past
| initialization?
|
| It seems like a complicated and error prone thing to be using no
| matter if it is thread safe or not. You can set up your own
| environment before you launch threads, and you can launch child
| processes with a different environment from the current process
| without modifying your own. If you fork, you can modify the
| environment in the child without affecting the parent until you
| exec.
|
| And even setenv() if it was reentrant and couldn't cause crashes,
| it wouldn't be thread safe, since threads share the environment
| and could get their environment changed under their feet.
| 1over137 wrote:
| >We should apparently read every function's specification
| carefully, not use software written by others, and not use
| threads. These are unrealistic assumptions in modern software.
|
| The first of the three listed items you should certainly do. I
| hope this author is not writing medical software, or anything
| important.
| faiD9Eet wrote:
| I am trying to make sense of the argument of pushing
| configuration into a library:
|
| - if the library is just a dependency, the Linux loader will set
| it up. It will have the same environment as the other libraries
| and as the main program.
|
| - if the library is set up by dlopen(), there is no way to
| provide an environment pointer
|
| Altering the global environment variable for child processes
| makes no sense, for execve() accepts a char* envp[]. So I guess
| we need to talk about issues with a specific use case of
| dlopen()
| planede wrote:
| Maybe the dlopen issue could be hacked around by dlmopen and
| injecting getenv and setenv symbols that access a different
| environment variable list than the application's.
| devnonymous wrote:
| Every time I hear someone lament about something not being thread
| safe, what I actually hear is - I want this shared invariant
| global state to be modifiable but it isn't. Which makes me ask
| the question why would you want that ?
|
| The process environment should not be the mechanism for threads
| to communicate with each other.
| sebstefan wrote:
| >The argument is that the specification clearly documents that
| setenv() cannot be used with threads. Therefore, if someone does
| this, the crashes are their fault.
|
| Oh, C is taking the php approach :)
| Ayesh wrote:
| In fairness to PHP, almost every PHP 7+ version from the last 5-6
| years had things deprecated especially to "cure" these things.
| fsckboy wrote:
| the env is unix. It's not C's job to fix. Keep studying it till
| you understand that.
|
| and don't forget, you are using unix because it defeated all the
| other options, because it was better and they were worse, so also
| keep studying till you understand why that is too.
|
| then this problem with env will fix itself.
|
| unix gives you tools to handle threads. C gives you tools to
| handle threads. Learn them, use them.
| jeroenhd wrote:
| On Windows you can use
| GetEnvironmentVariable/SetEnvironmentVariable (on XP and later),
| which do implement some locking and don't run into this issue
| because GetEnvironmentVariable copies the data out into a caller-
| supplied buffer. getenv_s was a nice effort, but it failed.
|
| I don't really understand why other languages such as Go and Rust
| decided to call the weird POSIX API rather than implementing
| their own API, which matches the semantics they expect. In cross
| platform C you'll be stuck with the outdated POSIX API design,
| but there's no reason why other languages should accept those
| same limitations.
|
| We're not running on PDP-11s anymore. You can afford a thread-
| safe hash map in your standard library. Ignore the limitations of
| the old C library. Twenty years ago, Microsoft released a better
| API, keep the crashy old API with tons of deprecation warnings
| (hell, add a compiler flag --enable-broken-c-api-designs) and
| just provide new APIs that are actually usable in modern
| programming environments.
| pjmlp wrote:
| Because as languages born from UNIX culture, the community
| usually sees POSIX everywhere, and then there are the other
| OSes.
|
| Have some fun reading on how Go handles files.
| jeroenhd wrote:
| Go likes to ignore edge cases ("all file names are UTF-8 and
| if they aren't then we'll just pretend they are") to make it
| easier to write code, so I'm not very surprised that it got
| caught in a POSIX related crash here.
|
| It's hard to tell if Microsoft altered the source code since,
| but the leaked XP source code (https://github.com/tongzx/nt5s
| rc/blob/master/Source/XPSP1/NT...) doesn't seem to do any
| getenv() calls for DNS lookups. The specific bug that started
| all this nonsense only triggers on (specific) Unix
| implementations. Unfortunately, Go opts to call the POSIX
| methods rather than
| GetEnvironmentVariable/SetEnvironmentVariable on Windows, so
| I suppose it's still possible that somewhere in the chain
| this bug gets triggered by Go code.
| pjmlp wrote:
| On non-UNIX platforms stuff like getenv() belongs to the
| specific compiler C library, not the OS API, hence why
| Windows doesn't use it.
| SAI_Peregrinus wrote:
| Non-Linux. Most other UNIX platforms also have syscalls
| depend on the specific C library.
| pjmlp wrote:
| On UNIX there is no distinction between C standard
| library and OS APIs, hence POSIX as the UNIX bits missing
| from ISO C.
|
| I feel you're mixing OS APIs, with the low level
| mechanism to enter into the kernel space.
| lifthrasiir wrote:
| Once you start to provide C interoperability, it is inevitable
| because C programs still rely on that broken API and many users
| would expect that Rust will give the same time zone as C. And
| with the exception of Windows, that API is often the single
| existing API throughout the entire system.
| Measter wrote:
| Rust's stdlib's API is completely safe here. On Windows[1], it
| uses the GetEnvironmentVariable/SetEnvironmentVariable API,
| which as you noted doesn't have this problem. On Unix[2], it
| maintains its own RwLock to provide synchronisation.
| Additionally, Rust's API only gives out copies of the data, it
| never gives you a pointer to the original.
|
| The problem comes when you do FFI on *nix systems, because
| those foreign functions may start making unsynchronised calls
| to getenv/setenv.
|
| [1] https://github.com/rust-
| lang/rust/blob/master/library/std/sr...
|
| [2] https://github.com/rust-
| lang/rust/blob/master/library/std/sr...
| masklinn wrote:
| Doesn't the essay contradict itself?
|
| It states that glibc "never free[s] environment variables", but
| then goes on to state
|
| > [in glibc] if a thread calling setenv() needs to resize the
| array of pointers, it copies the values to a new array and frees
| the previous one
|
| Since envvars cause crashes under glibc, I assume the initial
| assertion is incorrect.
| Tobu wrote:
| There are two levels of pointers: the environment block points
| to an array of pointers to C strings, this higher-level pointer
| can be updated and the previous one freed, which is a problem
| when it is being iterated on (which getenv does). The C strings
| themselves aren't freed by glibc, though some applications do
| modify them in place.
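|
| The outer level can be seen by walking `environ` directly. A
| minimal sketch, assuming a POSIX system; `env_lookup` is a
| hypothetical name. It is no safer than getenv(): it has the same
| race if another thread calls setenv() while the loop runs.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <string.h>

/* POSIX: a NULL-terminated array of pointers to "NAME=value" strings. */
extern char **environ;

/* Linear scan over the pointer array, similar in spirit to what
 * getenv() does. Returns a pointer into the matching string (just
 * past the '='), or NULL if the name is not present. */
const char *env_lookup(const char *name)
{
    size_t n = strlen(name);
    for (char **entry = environ; entry != NULL && *entry != NULL; ++entry) {
        if (strncmp(*entry, name, n) == 0 && (*entry)[n] == '=')
            return *entry + n + 1;
    }
    return NULL;
}
```

| If setenv() replaces and frees the array while this loop is
| running, `entry` points into freed memory, which is the crash
| being discussed.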
| tsukikage wrote:
| Hot take: if a crash can be made to disappear just by adding a
| mutex inside setenv(), this means code reading the environment is
| racing with code writing to it, and in this situation adding a
| mutex inside setenv() will generally make things worse instead of
| better: it may hide the immediate symptom, but the underlying
| race between reading and writing to the environment remains, your
| program will behave differently each run depending on who wins
| the race (with potentially catastrophic results depending on what
| the writes are doing), and the cause will be much harder to debug
| from cold due to the lack of a smoking gun pointing at
| environment manipulation.
|
| The multithreaded program needs to be restructured so that the
| parts that communicate via the environment are properly
| serialised with respect to each other, just as would be needed
| for any other communication via global state and/or access to a
| shared resource.
|
| This has to happen at a higher level than the individual
| getenv/setenv calls: entire blocks of logic containing the calls
| need to be made atomic (or otherwise refactored; perhaps you
| could do all the environment writes before spawning any threads)
| so that no other thread can blow away the environment contents in
| between the code that sets it up for some purpose and the code
| that implements that purpose; and once this is properly done, the
| individual calls themselves do not need further protection.
| hddqsb wrote:
| Sure, some applications might require custom higher-level
| synchronisation, but it's still important for getenv/setenv to
| be thread-safe (i.e. not crash):
|
| - The race might be irrelevant (e.g. simultaneous calls that
| access different variables are fine).
|
| - The application author might not have complete control over
| all calls to getenv/setenv (e.g. if using a third-party
| library).
| dark-star wrote:
| If you're doing setenv in multiple threads in parallel and can't
| afford the overhead of wrapping it in a mutex or whatever, then
| you're clearly doing something wrong...
| kukkamario wrote:
| That isn't enough. You'd need to wrap getenv calls as well and
| that can't be done as there are hidden getenv calls in stdlib
| implementation. There isn't a trivial fix.
| tsukikage wrote:
| if you're doing setenv in multiple threads in parallel, you
| clearly don't actually care about what state the environment is
| in at any given time, and therefore a better optimisation would
| be to not bother with the setenv calls at all
| planede wrote:
| So you make it thread-safe. Now what? Just because you don't get
| a data-race or undefined behavior, it doesn't make setenv/getenv
| usable across threads without any synchronization anyway.
|
| My take on it is that global mutable state is owned by the
| application, library code should never ever mutate it. Applies to
| the environment variables, stdout/stderr, locale.
|
| The application must ensure that when these are mutated they are
| not read concurrently by another thread. As external libraries
| rarely document the exact conditions when they read environment
| variables, the best is to only update the environment when no
| other thread is running. The absolute best is to avoid mutating
| it altogether.
| lifthrasiir wrote:
| > My understanding is the people responsible for the Unix POSIX
| standards did not like the design of these functions, so they
| refused to implement them.
|
| And herein lies the actual issue: C has a sh*tton of API issues
| in its standard library, and people _really_ want to fix as many
| of them as possible whenever possible, but doing so will
| destabilize the standard so most of them won't ever be accepted.
| In the case of Annex K many clearly felt that the size-restricted
| API alone is not enough, because it is still easy to
| desynchronize the buffer and the allocated size, and it's a good
| point if we ignore an obvious counterpoint of the lack of safe
| `getenv` alternatives in the standard library at all... I wonder
| about the alternative universe where we have two distinct
| standards for the C language and C standard library so that the
| library standard is much easier to fix and adapt.
| FounderBurr wrote:
| So much cringe, I can't believe this wasn't written as satire.
| ChrisRR wrote:
| Lots of C isn't thread safe, and that's the point. It's supposed
| to be close to the metal, and it's supposed to be small. If you
| want to add additional functionality then that's on the
| user/library. If we add thread safe/memory safe versions of every
| function, then we're moving away from C and into Rust territory.
|
| C is a powerful language, but it's not a language designed to
| hold your hand.
|
| Edit: Especially in this case where the environment is maintained
| by the OS, that puts the onus of safety on the OS to ensure that
| different processes can't modify and read the env simultaneously.
|
| If you're worried about reading and writing the env in different
| threads within the same process, then you need to reconsider your
| design.
| rini17 wrote:
| We're talking about C standard library here. Which has streams
| (struct FILE) that are threadsafe and use locks by default -
| even in single-threaded programs. They could have extended the
| interface to access the environment too. But idk, reasons.
| jstimpfle wrote:
| Contrary to fread(), getenv() does not copy data. It returns
| a pointer to memory that might be invalidated by the
| following setenv(). In other words, it can't be really made
| threadsafe -- it's fundamentally a thread-unsafe API.
|
| What _could_ be done is offering an alternative, thread-safe
| API that takes an RW mutex on get/set, and get reads to a
| user-provided buffer. However, that would be complicated as
| well because the user can't know the size of any envvar, and
| getting the size before reading the var is racy. So maybe
| there needs to be env_lock()/env_unlock() and
| getenv_unlocked()/setenv_unlocked(). Or a version that locks
| and strdups() the var. But that still leaves the problem that
| existing software does not use this API.
|
| And really, why would you set envvars instead of global
| variables? Seriously? A data structure that is a list of
| KEY=VALUE formatted zero-terminated string pointers? Just
| don't do it. Use setenv() only at startup and after fork()
| and before exec(), done.
| lifthrasiir wrote:
| My design is perfect, but libraries have no idea and can and
| will break my perfect design.
|
| Joke aside though, for this reason thread safety is one of
| general requirements for the composability that any large
| enough software project would definitely need.
| another2another wrote:
| I don't think C standard library particularly wanted to be
| problematic to use, but they just weren't prepared for
| multithreaded general programming, and tried to patch it with
| e.g. the _r() version of methods.
|
| An example where I recently made a change is localtime() where
| I changed to use the _r() variant after wrongly just calling
| localtime(), but only later realised they were using a global
| per-process buffer which _might_ produce wrong results in my
| logging code (e.g. if another thread calls localtime then my
| time buffer would be updated after I populated it 2 seconds
| prior).
|
| Now the clib could have made a per thread TLS slot for each
| threads' time values without too much overhead, which would
| have automatically fixed any careless uses of localtime() in
| multiple threads, but instead opted for the separate thread
| safe function calls.
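|
| A minimal sketch of the _r() pattern for that logging case,
| assuming POSIX; `format_local_time` is a hypothetical helper
| name. localtime_r() writes into a caller-supplied `struct tm`
| rather than a shared static buffer, so another thread's call
| cannot overwrite the result.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <time.h>

/* Formats `t` as local time into `buf` ("YYYY-MM-DD HH:MM:SS").
 * Returns the number of characters written (excluding the NUL),
 * or 0 if the conversion failed or the buffer was too small. */
size_t format_local_time(time_t t, char *buf, size_t len)
{
    struct tm tm_buf; /* per-call storage, not shared between threads */
    if (localtime_r(&t, &tm_buf) == NULL)
        return 0;
    return strftime(buf, len, "%Y-%m-%d %H:%M:%S", &tm_buf);
}
```

| The broken-down time lives on the caller's stack, so each
| thread's logging code works on its own copy.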
| SAI_Peregrinus wrote:
| C is not supposed to be close to "the metal" unless that's a
| single-core unified-memory processor. It's supposed to be
| maximally portable while retaining reasonable performance, and
| it does that very well.
| andrewaylett wrote:
| > It's supposed to be close to the metal, and it's supposed to
| be small.
|
| Unless you're developing for the PDP-11, I'm afraid I have some
| bad news for you.
|
| We absolutely want -- and occasionally get -- safe versions of
| unsafe functions, like strlcpy. And we rely on _quite a lot_ of
| code to maintain the appearance that C is close to the metal.
| 0xbadcafebee wrote:
| Much ado about nothing. If you have to set environment, do it at
| the beginning of a program, before threading. When you execute an
| application, pass a new environment, without setting it. There's
| really no need to set environment from threads.
|
| The only use case where this bug happens seems to be threaded
| programs that load libraries _after threading has initialized_,
| and want to configure those libraries with environment variables,
| that the user/parent program hasn't specified, rather than
| calling their APIs with specific arguments. If a library provides
| no way to set some option other than environment variables, their
| API is simply incomplete and needs fixing. An incomplete library
| is not a good enough reason to amend the C standard.
| dahfizz wrote:
| > We should apparently read every function's specification
| carefully.... These are unrealistic assumptions in modern
| software.
|
| Lol
| ewst wrote:
| why are you calling setenv in threads?
| amelius wrote:
| Possibly because some library they are using calls getenv.
| whalesalad wrote:
| Environment is considered an immutable space in all the systems I
| build. It's set before any program code / processes are even
| started.
| Joker_vD wrote:
| And to re-iterate my point from another thread on setenv, no,
| "just don't call setenv() after creating threads" is not a
| solution because even if the code _you_ wrote may be single-
| threaded, your application as a whole is not composed entirely of
| code written by you: the moment you link against _any_ 3rd-party
| library, your program can have arbitrarily many threads before it
| even reaches main().
| account42 wrote:
| > And to re-iterate my point from another thread on setenv, no,
| "just don't call setenv() after creating threads" is not a
| solution
|
| Yes it is.
|
| > the moment you link against any 3rd-party library, your
| program can have arbitrarily many threads before it even
| reaches main().
|
| So don't call setenv after you load any libraries. If that
| means you can't call setenv at all for your linking setup then
| so be it.
| grayhatter wrote:
| > the moment you link against any 3rd-party library, your
| program can have arbitrarily many threads before it even
| reaches main().
|
| what?! name one library that does this?
| aidenn0 wrote:
| I think the C++ version of the zeromq library could
| initialize a context in a static initializer, and zmq
| contexts involve creating a thread.
| grayhatter wrote:
| https://zeromq.org/get-started/?language=cpp#
|
| there's 5 listed for c++ which one creates threads before
| main()?
| aidenn0 wrote:
| It's been a long time, so I might be wrong, but with
| zmqpp[1] make a global zmqpp::context and I believe it
| will create threads before main().
|
| 1: https://github.com/zeromq/zmqpp
| slaymaker1907 wrote:
| Windows will have a few threads for doing IO at program
| startup.
| egberts1 wrote:
| I am only surprised that we have not added a new #ifdef
| UNSAFE_THREAD to stdhdr.h, so we can wrap setenv(), et al.
| Const-me wrote:
| Interestingly, Windows developers made much better choices back
| in the 1990s.
|
| GetEnvironmentStrings() API comes with the
| FreeEnvironmentStrings() counterpart. Whoever calls
| GetEnvironmentStrings() to get the entire environment is then
| responsible to call FreeEnvironmentStrings(), which allows
| thread-safe GetEnvironmentStrings() API.
|
| GetEnvironmentVariable() API is even simpler, it doesn't return a
| pointer, instead it fills a caller-provided buffer.
| NekkoDroid wrote:
| GetEnvironmentVariable() also has a subtle problem from what I
| can see: You first check how big the buffer has to be and in
| the second call you actually fill out the buffer, which allows
| for the variable to change between those 2 calls (I guess it's
| a variant of the time-of-check time-of-use race condition).
|
| There is getenv_s that doesn't have this problem to my
| knowledge, but it also doesn't exactly allow you to control the
| allocation of the memory (how important that is in cases like
| this is a different question)
| Const-me wrote:
| It seems the getenv_s and GetEnvironmentVariable API
| functions are 100% equivalent; the API is only slightly
| different.
|
| When you don't know the length of the variable you query, you
| guessed wrong on the first attempt, and other threads are
| changing the environment in the background, you might need 3
| or even more calls to either function (passing longer output
| buffers) to successfully retrieve the complete variable.
| NekkoDroid wrote:
| You are right, I mixed up my functions. What I actually
| meant was _dupenv_s
| InfiniteRand wrote:
| Beyond availability, anyone have issues with getenv_s?
| account42 wrote:
| > Since many libraries are configured through environment
| variables, a program may need to change these variables to
| configure the libraries it uses. This is common at application
| startup. This causes programs to need to call setenv(). Given
| this issue, it seems like libraries should also provide a way to
| explicitly configure any settings, and avoid using environment
| variables.
|
| This is the only correct solution. Even threadsafe, setenv
| doesn't guarantee anything about when the variable will take
| effect. There is no way for consumers to be notified of a
| changed variable. For that you need guarantees from the library
| and at that point the library can just as well provide a better
| configuration interface. Keep environment variables static for
| the process lifetime and there are no issues.
| cryptonector wrote:
| I'm glad TFA mentions Solaris/Illumos' implementation, so I don't
| have to.
|
| Java doesn't allow one to set environment variables for the
| running process, but it does allow setting env vars for processes
| being spawned. It would be better if all C libraries did what
| Illumos' does.
| kibwen wrote:
| Eyra is a new implementation of libc in Rust that addresses this
| in its default configuration:
|
| _"Eyra solves this by having setenv etc. just leak the old
| memory. That ensures that it stays valid for as long as any
| thread needs it. Granted, leaking isn't great, and Eyra makes it
| configurable with the threadsafe-setenv cargo feature, so it can
| be disabled in favor of the thread-unsafe implementation."_
|
| https://blog.sunfishcode.online/eyra-does-the-impossible/
| cryptonector wrote:
| It was never impossible. As TFA notes, Solaris/Illumos got it
| right 15 years ago.
| StillBored wrote:
| The problem isn't setenv() so much as getenv returns raw pointers
| to the data structure that is potentially being manipulated.
|
| But this is/was strictly a glibc/Linux bug because maintainers in
| the past didn't want to improve the situation (read: add locks
| which weren't 100% reliable) nor add thread-safe versions of the
| calls, e.g. getenv_s/getenv_r, as pretty much every other POSIX
| compliant system has done.
|
| And so the situation I hit many years ago (a proprietary library
| doing setenv()s before fork(), and me calling those routines from
| multiple threads) has now been fixed: setenv() on linux/glibc now
| works if it's built with locking:
|
| https://sourceware.org/git/?p=glibc.git;a=blob;f=stdlib/sete...
|
| So the remaining issue is making the environment thread safe,
| which means adding a getenv_r/s call and ensuring it's being used
| everywhere, which is probably a more complex problem than tossing
| the setenv() lock. But then in the setenv/fork case the forked
| process is a crapshoot whether it gets the "right" environment.
| In my case above it didn't really matter because the library was
| doing the equivalent of `export YOURACHILD=1` so the value wasn't
| being changed from invocation to invocation.
|
| But there are dozens and dozens of other similar gotchas in the
| spec, where error conditions or threading races exist and aren't
| noticeable until one understands how it is being implemented.
| There isn't a way to fix it with the library itself because many
| of the calls need more stored context space (ala win32 handles).
| So one ends up doing things like building serialization locks
| into the application to assure certain subsets of the posix/c/etc
| libraries aren't being called in parallel. (in this case
| getenv/setenv/fork).
| dpc_01234 wrote:
| We need to fork libc/posix, name it like `libc2` etc. fix
| versioning and start making changes, while allowing legacy
| software to deal with the old cruft.
|
| It's never gonna happen if we're waiting for some committee to deal
| with it, because they will be too afraid of breaking backward
| compatibility.
| leoh wrote:
| Yeah, it's lame. Linters and static analyzers should probably
| warn in addition to updating as much living documentation as
| possible.
|
| And, idk, suggest something like a method with a singleton mutex
| for getting/setting?
| worik wrote:
| > We should apparently read every function's specification
| carefully,
|
| Yes.
|
| You should.
| zare_st wrote:
| I don't see the problem here. I don't see why you need setenv.
| Please tell me why it's used in conjunction with pthreads. I
| think whoever is doing this is designing their software under
| wrong assumptions.
|
| Can you read/write to same fd socket across threads? No? So
| what's the issue then?
| Groxx wrote:
| Is anyone aware of a language that simply does not have globals
| like this? Or globals at all? The more I deal with other people's
| code, the more I want one.
|
| Obviously there are some semantic-globals (ports, env, main
| thread, etc) that are unavoidable, but we have a way to deal with
| that: dependency injection. Allow it in main, everything else has
| zero access unless it is given the instance representing it.
|
| Obviously that would be pretty painful in practice without some
| boilerplate-reducing tools, but... would it be worse _in
| aggregate_? Or would having _real_ control over all this at last
| pay off? I'm quite curious.
| andrewla wrote:
| A lot of comments in here seem to be saying that "setenv is
| rarely used, so we can make setenv threadsafe even if it is
| costly", but I think that's missing the point.
|
| The thread-safety is in the operators getenv/setenv (and putenv
| and unsetenv). Thread safety has to apply to all operators, and
| the functionality of getenv() (which is by far the most commonly
| used of these operators) is what has to be fixed.
|
| You simply cannot make setenv() thread-safe so long as getenv()
| has its current interface. You need to make getenv() safe first;
| ideally with a getenv_r() call that fills in a user-supplied
| buffer. From that point making the rest of the calls thread safe
| is trivial.
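|
| A minimal sketch of what a getenv_r()-style call that fills a
| user-supplied buffer could look like, assuming POSIX threads. The
| names env_r_mutex and env_get_r and the errno conventions are
| hypothetical, and the lock is only meaningful if every writer of the
| environment takes it as well:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical lock. Every setenv()/unsetenv()/putenv() caller in the
 * process would have to hold it too for the read below to be safe. */
pthread_mutex_t env_r_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Copies the value of `name` into `buf`, NUL-terminated.
 * Returns 0 on success. Returns -1 with errno = ENOENT if the variable
 * is unset, or -1 with errno = ERANGE if `buf` is too small. */
int env_get_r(const char *name, char *buf, size_t buflen)
{
    assert(name != NULL && buf != NULL);
    int err = 0;

    pthread_mutex_lock(&env_r_mutex);
    const char *value = getenv(name);
    if (value == NULL)
        err = ENOENT;
    else if (strlen(value) >= buflen)
        err = ERANGE;
    else
        strcpy(buf, value);  /* fits, including the terminating NUL */
    pthread_mutex_unlock(&env_r_mutex);

    if (err != 0) {
        errno = err;  /* set after unlocking so it is not clobbered */
        return -1;
    }
    return 0;
}
```
| Because no pointer into the environment escapes the locked region,
| the caller never holds memory that a concurrent setenv() could free.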
| jcalvinowens wrote:
| This is silly, setenv() isn't reentrant for the same reason that
| getopt() isn't reentrant: there's no valid reason to use it
| except at the very beginning of the program.
|
| The most common misuse I see is changing env before forking a
| child: nobody has to do that, execve() lets you pass arbitrary
| envp to the new process without changing yours.
|
| If you need to change env in threaded tests... frankly I think
| there was probably a better way to do whatever you're doing, but
| you can just declare a global lock and use it. I bet you could
| even LD_PRELOAD a custom setenv() that uses your lock.
|
| Nobody is pointing at concrete problems outside of Rust. Rust is
| just wrong here, sorry, the manpage has said this for a long
| time:
|
| > POSIX.1 does not require setenv() or unsetenv() to be
| reentrant.
|
| I think a more intellectually honest version of this article
| would have been "POSIX should have made setenv() reentrant", not
| "C is buggy": it's not buggy, it obviously complies with the
| standard. There's nothing to "fix", he wants to change the
| standard.
| kibwen wrote:
| _> the manpage has said this for a long time_
|
| Nobody is disputing what the manpage says. What people are
| arguing is that the specification should be improved so as to
| no longer say that. Please stop quoting the Posix docs, as it
| merely broadcasts that one has missed the point here.
| Documenting the behavior does not automatically excuse the
| behavior.
|
| As for Rust, the Rust docs make it clear that the underlying
| mechanism is fraught. Rust is well-acquainted with trying to
| find satisfactory solutions to the unhelpful and nonsensical
| tech stack that has been foisted upon the world by decades of
| worse-is-better laziness. And even if Rust were to mark
| std::env::set_var as unsafe, that doesn't magically fix
| anything; the underlying mechanism is broken, and actually
| fixing it is beyond Rust's control. Only the platforms can fix
| it.
| jcalvinowens wrote:
| > Documenting the behavior does not automatically excuse the
| behavior.
|
| It's a _standard_, and I'm citing the _standard_. The non-
| reentrant behavior doesn't need to be "excused", _it is
| correct_!
|
| If a microcontroller used a different bit than you expected
| it to for something, would the documentation need to be
| "excused" for "disagreeing" with you? That's how absurd what
| you're saying here sounds.
|
| > What people are arguing is that the specification should be
| improved
|
| That would be more reasonable, but that's not the argument.
| They're saying the standard is wrong. Standards can't be
| wrong, they're tautological. All that can be wrong is the
| programmer's understanding of them.
| paulddraper wrote:
| If your point is "the standard is correct as judged by the
| standard" your point is accurate and meaningless.
| loevborg wrote:
| You nailed it. The real villain in this story is mutability.
| We're addicted to changing variables in place, which is
| inherently complex - especially so in multithreaded
| environments. Environment variables are clearly best treated as
| immutable. Rust, despite its advances in some areas,
| perpetuates our addiction to mutable variables.
| orange_fritter wrote:
| Every attempt at escaping mutability basically kills the
| language in the mainstream because so much of "real"
| programming is just bit-twiddling that gets too verbose when
| immutability is involved. It's a good question whether Rust
| nudges the world toward functional/declarative spiritual
| purity by placing constraints on mutation. I'm betting that
| No, it doesn't.
| beltsazar wrote:
| > This is silly, setenv() isn't reentrant for the same reason
| that getopt() isn't reentrant: there's no valid reason to use
| it except at the very beginning of the program.
|
| It is not, unless you'd argue that "there's no valid reason to
| use [anything that transitively uses setenv()] except at the
| very beginning of the program." Did you even read the article?
| The author and the GitHub links mentioned provide some examples
| that use setenv() not directly, but transitively.
| jcalvinowens wrote:
| > Did you even read the article?
|
| Of course. I also clicked through and looked at his examples,
| did you?
|
| > The author and the GitHub links mentioned provide some
| examples that use setenv() not directly, but transitively.
|
| The big list at the end of the article? That's absolutely not
| true: they're all read only usecases that prove my point.
| Nobody should be changing any of those in the middle of the
| program. If you disagree, please point out specifically which
| one and explain why, because I don't see it.
| beltsazar wrote:
| Ah yes, silly mistakes--I mixed up getenv and setenv. Those
| examples transitively use getenv, not setenv.
|
| Having said that, my point still stands. You can only
| control what you directly use / don't use. Third-party
| libraries you use might use setenv at times other than "the
| very beginning of the program."
| duped wrote:
| > there's no valid reason to use it except at the very
| beginning of the program.
|
| POSIX doesn't define the "beginning of a program." Nor do you,
| if you're compiling C code. Libraries can (and do) spawn
| threads before main, so it's not even safe to use if you
| restrict yourself to "only the beginning of the program."
|
| > The most common misuse I see is changing env before forking a
| child: nobody has to do that, execve() lets you pass arbitrary
| envp to the new process without changing yours.
|
| Just remember to do it before fork() and not between fork() and
| exec() because you probably want to copy the existing envp and
| allocators usually aren't async signal safe. If you want to be
| sure to be correct, use posix_spawn to create a child process.
|
| --
|
| I think it's fair to say "setenv is buggy" in the sense that
| the POSIX specification for setenv guarantees it to be buggy in
| most programs that think they should be using it. What makes it
| an even bigger footgun is that the path of least resistance for
| people who need/want the behavior is the most difficult to use
| correctly. POSIX is full of shit like this, and "you should
| know better" isn't a good enough excuse.
|
| It's like saying "hey you should have known unless you press
| the doohickey for 5 seconds and turn the chainsaw at 90 degrees
| when you start it, the chain is going to fly off."
| jcalvinowens wrote:
| > Libraries can (and do) spawn threads before main
|
| I don't think any library should be calling setenv(), there's
| always a better way. If you know of a counterexample, please
| share it, I'd like to see it.
|
| > Just remember to do it before fork() and not between fork()
| and exec() because you probably want to copy the existing
| envp and allocators usually aren't async signal safe.
|
| Why would you go to all that trouble? Most implementations
| just use the stack to build the arguments for execve(),
| making all that irrelevant.
|
| > If you want to be sure to be correct, use posix_spawn to
| create a child process.
|
| That's not the purpose of posix_spawn(), it exists to deal
| with vfork-only nommu architectures:
|
| >> [posix_spawn() was] specified by POSIX to provide a
| standardized method of creating new processes on machines
| that lack the capability to support the fork(2) system call.
|
| >> These functions are not meant to replace the fork(2) and
| execve(2) system calls. In fact, they provide only a subset
| of the functionality that can be achieved by using the system
| calls.
|
| Anyway... back to our regularly scheduled programming:
|
| > the POSIX specification for setenv guarantees it to be
| buggy in most programs that think they should be using it.
| [...] POSIX is full of shit like this, and "you should know
| better" isn't a good enough excuse.
|
| I completely disagree.
|
| The C standard library is chock full of non-reentrant APIs.
| Nobody who has read more than ten manpages would reasonably
| assume that _anything_ is reentrant without an explicit
| assurance. Locks are trivial. POSIX expects you to use a lock
| if you want to use setenv() like this.
|
| I also want to point out that (at least on Linux) getenv()
| returns a pointer to the stack! How could anybody with basic
| programming literacy reasonably expect that to be thread
| safe? It's exactly analogous to getopt() and argv.
|
| No, the authors obviously couldn't imagine that $FAANG would
| be building 4GB binaries with 1000+ recursive library
| dependencies, each of which has its own chance to reenact the
| printer scene from Office Space with your shared envp. I think
| that's an organizational problem, not a POSIX problem.
|
| I'm not saying the standard shouldn't change: I would love to
| see argv and envp become immutable. If we're going to change
| it, that's the right move IMHO. But I don't really think
| that's practical...
|
| > It's like saying "hey you should have known unless you
| press the doohickey for 5 seconds and turn the chainsaw at 90
| degrees when you start it, the chain is going to fly off."
|
| I think it's more like saying "we can't stop people from
| getting hurt jaywalking, so we're going to solve the problem
| by legally requiring everybody to wear helmets at all times
| outdoors".
| layer8 wrote:
| > I don't think any library should be calling setenv()
|
| It's already a problem if the library is calling getenv(),
| because this could happen concurrently to the main program
| calling setenv(). The only universally safe solution is to
| not use setenv()/putenv() at all.
|
| Which I think is actually reasonable. But yes it makes
| those functions broken in a multithreaded program.
| zlg_codes wrote:
| That's essentially the problem with any technology, or
| community around said technology, that's primed against some
| target they want to replace.
|
| It has kept me away from Rust for years. If its fans weren't
| such fanboys for disruptive activity like rewriting things in
| Rust for the fuck of it, I might look closer into it. But
| considering where it came from and their politics, it doesn't
| seem like Rust is actually for everyone.
|
| Its angle of picking on C is also rich. If C is so bad, why has
| no major language supplanted it or C++? Anything with high
| importance on performance is written in low level languages,
| unconcerned with pushing a narrative or making BS 'more
| inclusive' which is really another view on affirmative action.
|
| My identity alone makes me unfit for the project.
| mike_hock wrote:
| > If C is so bad, why has no major language supplanted it or
| C++?
|
| Because no one, not even Rust, aims for feature parity with C
| ( _except_ on the language level).
| jcalvinowens wrote:
| > or community around said technology, that's primed against
| some target they want to replace.
|
| It's a vocal minority of the community, in my experience.
| Most people I've personally met who are passionate about Rust
| have a much more reasonable attitude about the whole thing,
| and see it as moving the needle in the right direction rather
| than a fully formed solution.
|
| It also really is interesting: obviously it's not the panacea
| it is sometimes made out to be, but it truly does eliminate a
| class of error. I admit to a bit of stubbornness myself, but
| I'm trying to work with it more.
| JohnFen wrote:
| > It has kept me away from Rust for years.
|
| Yes. The Rust community is what has kept me away from Rust
| for a long, long time. Now I'm learning Rust because it may
| become an important skill to have, but it's despite the
| community. They're very hard to put up with.
| Georgelemental wrote:
| > But considering where it came from and their politics
|
| > unconcerned with pushing a narrative or making BS 'more
| inclusive' which is really another view on affirmative action
|
| For what it's worth: I am conservative, right-wing, as
| opposed to "affirmative action" as it is possible to be--and
| I love Rust. Don't judge the language by the politics of a
| tiny section of its community, judge it on the technical
| merits.
| boring_twenties wrote:
| > The most common misuse I see is changing env before forking a
| child: nobody has to do that, execve() lets you pass arbitrary
| envp to the new process without changing yours.
|
| That's pretty much never what you would want. You want to set a
| single variable while inheriting the rest of the existing
| environment. In order to do that with execve() you would have
| to copy the existing environment first, yuck.
|
| And you wouldn't use setenv() before forking anyway, you would
| do it after forking, in the child, before exec.
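|
| A minimal sketch of the "copy the existing environment" route, for
| passing one changed variable to execve() without touching the
| parent's environment. The helper name env_with_override is
| hypothetical. It reads environ without any lock, so it is only safe
| while no other thread is modifying the environment:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

extern char **environ;

/* Returns a NULL-terminated, malloc()ed array holding every entry of
 * environ except an existing `name`, plus a new "name=value" entry in
 * the last slot. Existing strings are shared, not copied. To release
 * it, free() the last entry and then the array. Returns NULL if
 * allocation fails. */
char **env_with_override(const char *name, const char *value)
{
    assert(name != NULL && value != NULL);
    size_t name_len = strlen(name);
    size_t count = 0;
    while (environ != NULL && environ[count] != NULL)
        count++;

    /* Room for all old entries, the new one, and the NULL terminator. */
    char **envp = malloc((count + 2) * sizeof *envp);
    char *entry = malloc(name_len + strlen(value) + 2);
    if (envp == NULL || entry == NULL) {
        free(envp);
        free(entry);
        return NULL;
    }
    memcpy(entry, name, name_len);
    entry[name_len] = '=';
    strcpy(entry + name_len + 1, value);

    size_t out = 0;
    for (size_t i = 0; i < count; i++) {
        const char *e = environ[i];
        /* Skip the old definition of `name`, if there is one. */
        if (strncmp(e, name, name_len) == 0 && e[name_len] == '=')
            continue;
        envp[out++] = environ[i];
    }
    envp[out++] = entry;
    envp[out] = NULL;
    return envp;
}
```
| The array would be built before fork() and then handed to
| execve(path, argv, envp) in the child, which avoids calling malloc()
| between fork() and exec in a multithreaded parent.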
| JohnFen wrote:
| getenv/setenv is also part of the library, not the language. If
| it presents a problem for a particular program, it's easy enough
| to implement your own variety that behaves as you need, or to
| wrap the library call in something that provides the needed
| thread safety.
|
| This seems like a bit of a tempest in a teapot to me.
___________________________________________________________________
(page generated 2023-11-20 23:02 UTC)