Post AyEdAmsXTZbb9xWhbE by wolf480pl@mstdn.io
(DIR) Post #AyEdAlfjxfm1PyWy0m by zwol@masto.hackers.town
2025-09-15T14:11:58Z
2 likes, 0 repeats
A tiny hill I will die on: It is *wrong* to print out numeric errno codes. If you feel it is necessary to print a mysterious code in addition to the human-readable message you get from strerror(), print the Exxx name. Yeah, C doesn't give you any good way to do that. Maybe you don't need to print a mysterious code at all?
You know what you *should* print, though? The name! of the thing! you couldn't do something with! (usually but not always a filename)
(DIR) Post #AyEdAmsXTZbb9xWhbE by wolf480pl@mstdn.io
2025-09-15T15:53:40Z
0 likes, 0 repeats
@zwol tbh? I prefer number and strerror to just strerror. Half of the time this is an error the program's author did not anticipate, and I need to look up the individual syscall and its error codes anyway to know why this particular error happened.
Also, another thing that's helpful is the name of the syscall / libc function that failed. Ideally with some other unique words around it so that I can find the call site on grep.app or Debian code search.
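A minimal C sketch of the message shape the two posts above describe between them: the function that failed, the thing it failed on, the strerror() text, and (following wolf480pl's preference) the raw number. The helper name format_sys_error and the exact wording of the message are assumptions made for illustration; nothing in the thread prescribes them.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the thread): writes
 *   <call>("<path>") failed: <strerror text> (errno <n>)
 * into buf and returns whatever snprintf returns.
 * The strerror text depends on the C library and the current locale. */
int format_sys_error(char *buf, size_t size, const char *call,
                     const char *path, int err)
{
    return snprintf(buf, size, "%s(\"%s\") failed: %s (errno %d)",
                    call, path, strerror(err), err);
}
```

A caller would save errno right after the failing call, for example `int e = errno;`, and then pass `"open"`, the path, and `e` to the helper before printing the buffer to stderr.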
(DIR) Post #AyEdAtVOq0FtiDkp0K by zwol@masto.hackers.town
2025-09-15T14:17:20Z
0 likes, 0 repeats
this grump brought to you this morning by
    let mut msg = ioerr.to_string();
    if let Some(cut) = msg.rfind(" (os error ") {
        msg.truncate(cut);
    }
(DIR) Post #AyEe66zrLlBa3qHjQe by zwol@masto.hackers.town
2025-09-15T16:04:04Z
0 likes, 0 repeats
@wolf480pl The strerror strings are in a perfect 1:1 correspondence with the numeric codes. The numeric codes add no additional information to the error message; they're visual clutter at best.
Worse, while the _strings_ are nigh-universally the same across Unixes, the _numbers_ are _not_. If you tell me "error code 13" instead of a strerror string, I have to ask you to identify your kernel, CPU, and in some cases exact ABI revision before I can do anything with that.
(DIR) Post #AyEeeCuzq9OiGwGhLU by zwol@masto.hackers.town
2025-09-15T16:06:58Z
1 likes, 0 repeats
@wolf480pl Now, the relationship between Exxxx codes and strerror strings can be obscure (why is "Permission denied" the string for EACCES rather than EPERM?) and the E-codes are what's in the manpages, so, if the C library gave a good way to do it, I _would_ fully support printing the E-codes alongside the strerror strings. But the numbers are an implementation detail that should not be exposed to humans working at a level higher than machine code.
(DIR) Post #AyEeeE9ZFSeC6Q5qhE by zwol@masto.hackers.town
2025-09-15T16:07:25Z
0 likes, 0 repeats
@wolf480pl And you're absolutely right about printing the name of the syscall that failed.
(DIR) Post #AyEeeFAJUAq1F1RxUu by wolf480pl@mstdn.io
2025-09-15T16:10:12Z
0 likes, 0 repeats
@zwol the strings differ by locale.
You can look up the symbolic name with errno(1) like `errno -s "left on device"` *if* you have the same locale as the program that generated the error (or override it with env vars which is a pain). And with a lot of typing, and possible typos.
With a number, you can just `errno 28`, either on the machine that generated the error, or on another machine with the same uname. It's much quicker in my experience.
(DIR) Post #AyEekIxuGqKIlQXMMS by zwol@masto.hackers.town
2025-09-15T16:11:19Z
0 likes, 0 repeats
@wolf480pl If I had written errno(1) it would not accept the numbers, just the Exxx codes and the strings. That's how strongly I feel about this.
(DIR) Post #AyEenQmbf2YhsYk4gK by wolf480pl@mstdn.io
2025-09-15T16:11:55Z
0 likes, 0 repeats
@zwol so you'd go out of your way to make my work more painful?
(DIR) Post #AyEerxVHwqfgWDniCW by zwol@masto.hackers.town
2025-09-15T16:12:42Z
0 likes, 0 repeats
@wolf480pl You shouldn't ever see or need to know those numbers in the first place! Every time they get thrown at you is a bug! That's my whole point!
(DIR) Post #AyEf9F7FiePO46neZE by wolf480pl@mstdn.io
2025-09-15T16:15:51Z
0 likes, 0 repeats
@zwol strerror not including the Exxxx is a more important bug, fix that one first (probably too late to do that)
Also, consider the following:
What if the (non-libc) function being called sometimes returns an errno, and sometimes an integer that is not an errno, and the caller calls perror / strerror on it either way?
Obviously that's buggy code, but if it wasn't buggy code, I wouldn't be debugging it.
But now knowing the number can be very helpful.
(DIR) Post #AyEfQJe883Lv2SJqhU by wolf480pl@mstdn.io
2025-09-15T16:18:56Z
0 likes, 0 repeats
@zwol admittedly, this is rare. It happened to me between 0 and 2 times.
But like, the only harm from seeing an implementation-detail number like this would be if people started assuming they'll always be those particular numbers.
So I'm guessing you've seen some people do that?
(DIR) Post #AyEhnQCW7nnrw6Y5TM by zwol@masto.hackers.town
2025-09-15T16:45:30Z
0 likes, 0 repeats
@wolf480pl Exactly. More times than I can count, in fact. It's a really easy beginner mistake because gdb prints errno like any other int, and beginners usually learn about printf("%d") before they learn about strerror and perror.
There's less variation nowadays, as the old proprietary Unixes and the odder CPU ISAs die out, but even today I wouldn't feel safe assuming that they are the same between NetBSD and FreeBSD on the same CPU, let alone any more distant relations.
(DIR) Post #AyEiFMZnC7Xsnql8DI by wolf480pl@mstdn.io
2025-09-15T16:50:34Z
0 likes, 0 repeats
@zwol I wouldn't feel safe assuming they're the same between Linux on x86 and Linux on ARM.
And yeah, printing only the number is bad.
But I don't see how removing support for numbers from errno(1) changes anything other than making life difficult for every downstream user of such bad software.
Beginners don't know about errno(1) anyway.
(DIR) Post #AyEutbs4wk6nGyGnmT by zwol@masto.hackers.town
2025-09-15T19:12:17Z
0 likes, 0 repeats
@wolf480pl errno(1) shouldn't accept numbers all by themselves because the numbers are meaningless without identifying the system that generated them. `errno x86_64-unknown-linux-gnu 23` has enough information to print the same thing as `errno ENFILE`, but `errno 23` all by itself shouldn't be blindly assuming the 23 came from the system it's running on; that reinforces the misconception that the numbers are consistent across systems.
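A rough C sketch of the target-qualified lookup proposed above, assuming a hypothetical function errno_name_for_target and a deliberately tiny table. The target strings and table layout are illustrative assumptions, and a real tool would generate complete tables from each system's errno.h. The few entries shown reflect that 11 and 35 swap meaning between Linux and FreeBSD, while 23 means ENFILE on both.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative entries only, not a complete errno table. */
struct errno_entry {
    const char *target;  /* target triplet the number came from */
    int number;          /* numeric errno value on that target  */
    const char *name;    /* symbolic Exxx name                  */
};

static const struct errno_entry errno_table[] = {
    { "x86_64-unknown-linux-gnu", 11, "EAGAIN"  },
    { "x86_64-unknown-linux-gnu", 23, "ENFILE"  },
    { "x86_64-unknown-linux-gnu", 35, "EDEADLK" },
    { "x86_64-unknown-freebsd",   11, "EDEADLK" },
    { "x86_64-unknown-freebsd",   23, "ENFILE"  },
    { "x86_64-unknown-freebsd",   35, "EAGAIN"  },
};

/* Returns the Exxx name for (target, number), or NULL if the pair
 * is not in the table. The number alone is never enough. */
const char *errno_name_for_target(const char *target, int number)
{
    size_t count = sizeof errno_table / sizeof errno_table[0];
    for (size_t i = 0; i < count; i++) {
        if (errno_table[i].number == number &&
            strcmp(errno_table[i].target, target) == 0)
            return errno_table[i].name;
    }
    return NULL;
}
```

With this shape, the same number 35 resolves to EDEADLK for the Linux target and to EAGAIN for the FreeBSD one, which is the kind of silent mismatch the thread is arguing about.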
(DIR) Post #AyEvCLLebQbzWu2JZA by wolf480pl@mstdn.io
2025-09-15T19:15:42Z
0 likes, 0 repeats
@zwol gcc should not allow compiling files without specifying a triplet, because a source file by itself is not enough information to generate a binary. `gcc -c x86_64-unknown-linux-gnu main.c` has enough information to build main.o, but `gcc -c main.c` all by itself shouldn't be assuming the binary should be built for the system it's running on; that reinforces the misconception that instruction sets are consistent across systems.
(DIR) Post #AyEyo9NXqx07H1dkzA by zwol@masto.hackers.town
2025-09-15T19:56:04Z
0 likes, 0 repeats
@wolf480pl Plan 9 did as you suggest and it may even have been a good idea. But there's an important difference: nobody copies an entire object file into a 2am email to their instructor begging for help. Misconverted errno codes, however, yes they do.
(DIR) Post #AyEyxGpPWe0Iz41UMS by wolf480pl@mstdn.io
2025-09-15T19:57:47Z
0 likes, 0 repeats
@zwol > nobody copies an object file to another computer
Because they've tried before and found out that it doesn't work
(DIR) Post #AyEzzXYsNlaxPlw3LU by zwol@masto.hackers.town
2025-09-15T20:09:23Z
0 likes, 0 repeats
@wolf480pl but when they copy the numeric errno code out of gdb on the embedded dev board and paste it into errno(1) on the host system and it silently gives them the wrong answer, they *don't* learn that that was the wrong answer until, in the worst case, several more hours of frustration. You're reinforcing my position with this line of reasoning.
(DIR) Post #AyHHAA4w7PVLJ2V9ge by lanodan@queer.hacktivis.me
2025-09-16T22:31:07.089717Z
0 likes, 0 repeats
@alanc @zwol @jmc Well those tests would be broken across systems anyway since the error strings are non-standard.
In fact glibc and musl have different ones. strerror(0) == "No error information" in musl being probably the best one against glibc's strerror(0) == "Success"
(DIR) Post #AyHIOKFVCZgarDlGjY by lanodan@queer.hacktivis.me
2025-09-16T22:44:52.693943Z
0 likes, 0 repeats
@alanc @jmc @zwol Yeah, that bug makes a lot of sense. I think I'd go for something like "[ENOENT #2] No such file or directory" to keep things compact and not too strange looking when using format strings with strerror output potentially in the middle of a message (often seen in non-C languages) but that's a somewhat personal taste.
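A small C sketch of the compact "[ENOENT #2] No such file or directory" format suggested above. The helper names errno_name and format_errno are hypothetical, and the lookup covers only a handful of codes for illustration. Recent glibc versions offer strerrorname_np() as a GNU extension for the name lookup, but it is not portable, so this sketch uses a switch on the Exxx macros instead.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, deliberately incomplete name lookup. The case labels
 * use the Exxx macros, so the numbers come from the local errno.h. */
static const char *errno_name(int err)
{
    switch (err) {
    case EPERM:  return "EPERM";
    case ENOENT: return "ENOENT";
    case EACCES: return "EACCES";
    case ENFILE: return "ENFILE";
    case ENOSPC: return "ENOSPC";
    default:     return NULL;  /* not covered by this sketch */
    }
}

/* Writes "[ENAME #n] <strerror text>" into buf. Falls back to
 * "[errno #n] ..." when the name is not in the table above. */
int format_errno(char *buf, size_t size, int err)
{
    const char *name = errno_name(err);
    return snprintf(buf, size, "[%s #%d] %s",
                    name ? name : "errno", err, strerror(err));
}
```

The description part still comes from strerror(), so it varies with the C library and locale, as noted elsewhere in the thread; only the bracketed prefix is stable for a given system.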
(DIR) Post #AyHTacibSS8z6sRKNM by lanodan@queer.hacktivis.me
2025-09-17T00:50:20.245687Z
0 likes, 0 repeats
@alanc @zwol @jmc FreeBSD 14.0 and current also seems to be using "Floating point exception" (lib/libc/gen/siglist.c)
And musl uses "Arithmetic exception"
So I guess there's a split.
(DIR) Post #AyHZdcD6Gh5txW3ZqK by zwol@masto.hackers.town
2025-09-17T01:44:38Z
1 likes, 0 repeats
@lanodan @alanc @jmc Yeah, my brief time teaching was much later and all the students were on regular Linux (glibc, not musl). Though, if the Sun clade has had "Arithmetic exception" forever and musl uses it too, that's all the excuse I need to go change glibc myself :)