[HN Gopher] Checked integer arithmetic in the prospect of C23
___________________________________________________________________
Checked integer arithmetic in the prospect of C23
Author : signa11
Score : 50 points
Date : 2022-12-19 15:45 UTC (7 hours ago)
(HTM) web link (gustedt.wordpress.com)
(TXT) w3m dump (gustedt.wordpress.com)
| tinglymintyfrsh wrote:
| tl;dr
|     #include <stdckdint.h>
|     bool ckd_add(type1 *result, type2 a, type3 b);
|     bool ckd_sub(type1 *result, type2 a, type3 b);
|     bool ckd_mul(type1 *result, type2 a, type3 b);
|
|     #include <stdckdint.h>
|     #include <limits.h>
|     /* ... */
|     int x;
|     int a = INT_MAX;
|     int b = INT_MAX;
|     if (!chk_add(&x, a, b)) {
|         /* error! */
|     }
|
| Other stuff on the table for C23
|
| - https://thephd.dev/c-the-improvements-june-september-virtual...
|
| - (PDF) https://open-std.org/jtc1/sc22/wg14/www/docs/n3054.pdf
| RustyRussell wrote:
| Let's take this as evidence that the proposal is counter-
| intuitive?
| heywhatupboys wrote:
| > if (!chk_add(&x, a, b)) {
| >     /* error! */
| > }
|
| does it return non-zero on success???
| EdSchouten wrote:
| It returns a boolean.
| tinglymintyfrsh wrote:
| They're using a bool, so it violates the old-school C
| paradigm used for library calls. This is more of a macro
| rather than a syscall or standard library function call.
| comex wrote:
| It returns a bool, but true means error, not false.
| comex wrote:
| You have it backwards: it returns true on error.
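To make the polarity discussed in this sub-thread concrete, here is a minimal sketch rather than a definitive implementation. It assumes either a toolchain that ships C23's `<stdckdint.h>` or, failing that, the GCC/Clang `__builtin_add_overflow` builtin, which the shim below wraps so that the result pointer comes first. `add_or_report` is a hypothetical helper name, not part of any standard.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/* Use the C23 header when the toolchain ships it. */
#if defined(__has_include)
#  if __has_include(<stdckdint.h>)
#    include <stdckdint.h>
#  endif
#endif

/* Assumption: otherwise fall back to the GCC/Clang builtin, which takes
 * the result pointer last instead of first. */
#ifndef ckd_add
#  define ckd_add(r, a, b) __builtin_add_overflow((a), (b), (r))
#endif

/* Hypothetical helper: returns true when a + b fits in an int. */
bool add_or_report(int a, int b, int *sum) {
    if (ckd_add(sum, a, b)) {   /* true means the result did NOT fit */
        fprintf(stderr, "overflow adding %d and %d\n", a, b);
        return false;
    }
    return true;
}
```

With this shape, `if (ckd_add(&x, a, b))` reads as "if overflow", which is the opposite of the test in the tl;dr snippet.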
| olliej wrote:
| This simply standardizes the builtins that the major C++
| compilers already have. It does not remove the absurd "overflow
| is UB" semantics that introduce security bugs.
| gustedt wrote:
| Only that here we are talking about C. But yes, most C
| compilers already seem to have this as builtins.
| olliej wrote:
| haha, I'm so used to reading C++2x I assumed C++ - however
| the problem exists in both :-/
| Someone wrote:
| FTA: _"Their working is quite simple: the arithmetic is as if
| performed in the set of mathematical integers and then the value
| is written to result. If it fits, the return value is false. If
| it doesn't fit, the return value is true"_
|
| They also give example code
|     bool add_invalid = ckd_add(&result_add, a, b);
|
| I can see that fits with "most of the time, anything positive
| means 'no error'", for example in _malloc_, _write_, _read_ or
| _printf_, but these new functions return bool, not int, and the
| chosen method will require writing a double negation sometimes:
|     if(!add_invalid) { ... }
|
| That's not too bad, but if I were to see
|     if(!ckd_add(&result_add, a, b)) { ... }
|
| I would expect that to test for failure, not success.
|
| Because of that, I think I would have chosen to return true on
| success, false on failure. I'm curious as to what arguments led
| to the choice made.
| dooglius wrote:
| I find the most readable, unambiguous thing is to define
| explicit macros or constants, e.g.
|     if(ckd_add(&result_add, a, b) == CKD_SUCCESS) { ... }
|
| or alternatively,
|     if(CKD_SUCCESS(ckd_add(&result_add, a, b))) { ... }
| kevin_thibedeau wrote:
| The return value is the overflow condition so just name it
| CKD_OVERFLOW.
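The naming idea in this sub-thread could look roughly like the sketch below. `CKD_OVERFLOW`, `CKD_SUCCESS`, and `scaled_size` are hypothetical names taken from or inspired by the comments above; C23 defines none of them. The GCC/Clang builtin stands in for `ckd_mul` so that the sketch also compiles on pre-C23 toolchains.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; not part of C23 or any library. */
#define CKD_OVERFLOW true
#define CKD_SUCCESS  false

/* Computes count * elem_size for an allocation, refusing to wrap.
 * Returns true when the product fits in size_t. */
bool scaled_size(size_t count, size_t elem_size, size_t *bytes) {
    /* __builtin_mul_overflow is the GCC/Clang builtin; C23's
     * ckd_mul(bytes, count, elem_size) has the same return polarity. */
    if (__builtin_mul_overflow(count, elem_size, bytes) == CKD_OVERFLOW) {
        return false;
    }
    return true;
}
```

Naming the constant after the condition it reports keeps the call site readable without changing what the checked operation returns.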
| RustyRussell wrote:
| Yes, it's backwards. And counter-intuitive use of bool :(
|
| They felt fine changing the argument order, why stick with the
| reverse polarity?
| chongli wrote:
| In C, the value 0 is equivalent to false and all nonzero values
| are equivalent to true. It's a convention throughout the C
| standard library to return 0 on success and nonzero when some
| error occurred. The behaviour of the new checked arithmetic
| library is consistent with that convention.
| SAI_Peregrinus wrote:
| And in shell it's convention for programs to return 0 for
| success and nonzero when an error occurred. The issue is that
| in shell, the `true` builtin returns `0` and `false` returns
| `1`, which is the opposite of C's `bool` and of almost every
| other language's Boolean type.
| gustedt wrote:
| Unfortunately there are several error conventions in the C
| standard.
|
| Here, the committee just standardized existing practice,
| namely the gcc builtins. We just adjusted the call sequence
| by putting the pointer parameter for the result first.
| Someone wrote:
| > It's a convention throughout the C standard library to
| return 0 on success and nonzero when some error occurred.
|
| If only it were so simple. _read_ and _write_ , for example,
| return a number less than zero on error and a non-negative
| number on success, and _malloc_ returns zero on error, and
| nonzero on success.
|
| The general rule for early C seems to be "whatever's the best
| way to cram a return value or an error in an int" (probably
| the correct decision for the time)
|
| Also, these new functions return a bool, which, in C23, gets
| integer-converted to zero for _false_ and one for _true_
| (https://en.cppreference.com/w/c/language/bool_constant; C17
| had macros for true and false, with false being zero),
|
| and the reverse, converting to _bool_, similarly has zero for
| false
| (https://en.cppreference.com/w/c/language/conversion#Boolean_...):
|
| _"A value of any scalar type (including nullptr_t) (since
| C23) can be implicitly converted to _Bool. The values that
| compare equal to an integer constant expression of value zero
| are converted to 0 (false), all other values are converted to
| 1 (true)."_
|
| (https://en.cppreference.com/w/c/language/bool_constant)
| dahfizz wrote:
| _when the return value is an error code_, zero means
| success and nonzero means failure. Functions like `read`,
| `recv`, etc etc don't just return an error code. They
| return an actual value.
|
| Functions that only return an error code like `stat`,
| `connect`, and the proposed ckd_add, return 0 on success
| and nonzero on error.
| dahfizz wrote:
| The proposal fits my intuition. When the return value is an
| error code, truthy values are an error.
|     int rc = func();
|     if( rc ) { /*handle error*/ }
|
| examples from the stdlib: connect(), stat(), etc. Hell, even
| main is defined to return 0 on success, nonzero on error.
| RustyRussell wrote:
| Not for bool though.
| jacquesm wrote:
| What was the rationale for not simply making it optional to throw
| an exception on overflow?
| gustedt wrote:
| Besides C not having exceptions (that you could catch), the
| point was and is to have a way such that such a call has
| defined behaviour under any circumstances. The return value of
| the functions can even be ignored if the wrap around of the
| overflowing value is what your code expects.
|
| So it is on the programmer to define what happens on error:
| they could ignore it, try to back off by computing the high
| value bits, or call `exit` or `abort`.
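Two of the policies listed above can be sketched as follows. This is an illustration under assumptions, not the proposal's own code: `add_wrapping` and `add_or_abort` are hypothetical helper names, and the GCC/Clang builtin is used in place of `ckd_add` so the sketch compiles without a C23 header.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Policy 1: ignore the flag and keep the wrapped value.
 * The call itself has defined behaviour even when the sum does not fit. */
int32_t add_wrapping(int32_t a, int32_t b) {
    int32_t r;
    (void)__builtin_add_overflow(a, b, &r);
    return r;
}

/* Policy 2: treat overflow as fatal. */
int32_t add_or_abort(int32_t a, int32_t b) {
    int32_t r;
    if (__builtin_add_overflow(a, b, &r)) {
        fputs("integer overflow\n", stderr);
        abort();
    }
    return r;
}
```

Either way the decision lives at the call site, which is the point being made: the operation reports, the programmer decides.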
| dahfizz wrote:
| C doesn't have exceptions...?
| heywhatupboys wrote:
| kinda does though. floating point exception vectors are a
| thing and settable from most compiler impls
| dahfizz wrote:
| fenv is available in standard C99. There is a GNU extension
| to trap and throw a signal when a fp error is hit
| (feenableexcept). I would definitely argue that signals !=
| exceptions.
| jacquesm wrote:
| The floating point implementation can signal SIGFPE with a
| code FPE_INTOVF. It would seem to me that that is a suitable
| exception mechanism; it's just that the source isn't the
| floating point unit but the regular CPU.
|
| Signals (kill), signal, raise, sigsetjmp, siglongjmp etc are
| C's exception handling mechanism. It's not as well integrated
| into the language as say a try-catch construct but it works
| well enough for situations like these. See: signal.h and
| setjmp.h
|
| https://en.wikipedia.org/wiki/Signal_(IPC)
| dahfizz wrote:
| Signals are not exceptions. Languages with exceptions still
| have to handle signals separately. Signals are an OS
| construct, and exceptions are a language construct.
|
| To the question of "why not raise a signal on integer
| overflow in C?" - because signals are a terrible way of
| dealing with this. The signal handler doesn't know what
| code caused the overflow, and can't really do anything
| about it. Once the signal handler returns, the code itself
| has no idea it caused an overflow. Signals are a way for
| the OS to notify your program, not a mechanism for control
| flow, after all. That's why `feenableexcept` is a niche
| extension that nobody uses.
|
| The standard way of checking for fp errors is by calling
| `fetestexcept`. Personally I prefer this strategy (doing
| operation, then checking for errors) vs the new proposal
| for ints (checking for potential errors before doing the
| operation). But that is a matter of taste.
| jacquesm wrote:
| Interesting, I've always considered signals to be an
| exception handling mechanism, as in the 'normal flow' of
| a program is interrupted and dealt with - or not -
| through some other mechanism. Learn a new thing every
| day, even after 40 years of programming in C :) Thanks!
|
| https://en.wikipedia.org/wiki/Exception_handling
|
| (which has this bit: "C does not have try-catch exception
| handling, but uses return codes for error checking. The
| setjmp and longjmp standard library functions can be used
| to implement try-catch handling via macros.")
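As a rough illustration of the quoted sentence, the sketch below uses `setjmp` and `longjmp` to get try/catch-like control flow around a checked addition. Everything named here (`overflow_handler`, `add_or_jump`, `try_sum`) is hypothetical, the single global `jmp_buf` is an assumption that makes it unsuitable for threaded code, and the GCC/Clang builtin stands in for `ckd_add`.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single "catch" target; not thread-safe. */
static jmp_buf overflow_handler;

/* "Throws" by jumping back to the most recent setjmp on overflow. */
int32_t add_or_jump(int32_t a, int32_t b) {
    int32_t r;
    if (__builtin_add_overflow(a, b, &r)) {
        longjmp(overflow_handler, 1);
    }
    return r;
}

/* Sums n values; returns false if any intermediate sum overflowed. */
bool try_sum(const int32_t *v, size_t n, int32_t *out) {
    if (setjmp(overflow_handler) != 0) {
        return false;               /* the "catch" branch */
    }
    int32_t acc = 0;                /* not read again after a longjmp */
    for (size_t i = 0; i < n; i++) {
        acc = add_or_jump(acc, v[i]);
    }
    *out = acc;
    return true;
}
```

Compared with simply checking the returned flag in the loop, this buys little here; it mainly shows what the "exceptions via macros" idea reduces to.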
| dahfizz wrote:
| It's all semantics, I guess. You could argue that signals
| can be used to handle exceptional conditions (dereffing
| NULL). But signals are significantly different than what
| other programming languages call "exceptions".
|
| It's the same thing as "run time". Pedantically, crt0
| exists and therefore C has a "runtime". But it is nothing
| like what we refer to as a "runtime" today. The literal
| words are true but the meaning of the words doesn't match
| expectations.
| jacquesm wrote:
| > signals are significantly different than what other
| programming languages call "exceptions"
|
| C is likely considerably older than those 'other
| programming languages' and I'm still stuck in the past
| with my terminology.
| kevin_thibedeau wrote:
| Ada had exceptions in 83 that are analogous to their
| modern incarnation. C was still just a baby then.
| jacquesm wrote:
| Ada was unobtanium for a long time for mere mortals like
| myself. In 1983 I was 18 and had access to a C compiler,
| an Ada compiler would have cost me an arm, a leg and my
| still to be first born, and information about the
| language was pretty much limited to what you could get
| from magazine articles of people that had maybe at some
| point known someone who had seen an Ada compiler in the
| wild.
|
| The only other realistic options outside of
| government/enterprise were Pascal, BASIC and assembler,
| and within government/enterprise the bulk of the work was done in COBOL.
| gpderetta wrote:
| These operations simply return the carry flag, they are
| supposed to map directly to the hardware which normally doesn't
| raise exceptions for integer overflow.
|
| Also they can be useful to implement bigints.
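The bigint remark can be illustrated with a multi-limb addition in which the returned flag plays the role of the carry. This is a sketch under assumptions: `bigint_add` and the little-endian array-of-`uint64_t` layout are hypothetical choices, and the GCC/Clang builtin stands in for `ckd_add`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Adds two numbers stored as `limbs` 64-bit words, least significant
 * word first. Returns the carry out of the most significant word. */
bool bigint_add(uint64_t *sum, const uint64_t *x, const uint64_t *y,
                size_t limbs) {
    bool carry = false;
    for (size_t i = 0; i < limbs; i++) {
        uint64_t t;
        /* For unsigned operands the overflow flag is exactly the carry. */
        bool c1 = __builtin_add_overflow(x[i], y[i], &t);
        bool c2 = __builtin_add_overflow(t, (uint64_t)carry, &sum[i]);
        carry = c1 || c2;   /* at most one of the two can be set */
    }
    return carry;
}
```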
| raphlinus wrote:
| I posted overflow checking of signed integer arithmetic as a
| puzzle yesterday[1]. I got some good responses but none quite as
| minimal wrt number of instructions as my own solution:
|     bool add_will_overflow(int32_t a, int32_t b) {
|         uint32_t c = (uint32_t)a + (uint32_t)b;
|         return (((uint32_t)a ^ c) & ((uint32_t)b ^ c)) >> 31;
|     }
|
| That produces the following assembly (see Godbolt[2]):
|     lea     edx, [rdi+rsi]
|     mov     eax, edi
|     xor     eax, edx
|     xor     esi, edx
|     and     eax, esi
|     shr     eax, 31
|     ret
|
| In Rust, you can write a.checked_add(b).is_none() which produces
| the following assembly[3]:
|     add     edi, esi
|     seto    al
|     ret
|
| A fun fact about this code: the overflow flag which is set by the
| add instruction and then harvested dates back at least to the
| 8080 (almost 50 years ago) and is not present in vanilla ARM.
| However, Apple Silicon has it as an extension, to make life
| easier for Rosetta 2 binary translation[4]. So when you do get to
| use this shorter code sequence, be thankful of the effort that
| chip designers put in to make it execute efficiently.
|
| I expect the C23 built-in functions will perform as well as Rust
| here, which is a win both for ergonomics (you can't really
| consider the current state of "will a+b overflow" to be
| discoverable) and performance.
|
| [1]: https://mastodon.online/@raph/109535617953722719
|
| [2]: https://godbolt.org/z/17zMsWjYv
|
| [3]: https://rust.godbolt.org/z/36Ta9oP1P
|
| [4]: https://news.ycombinator.com/item?id=33635720
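For readers who want to convince themselves that the bit trick and the compiler-provided check agree, here is a small sketch. `add_will_overflow` is the function from the comment; `add_will_overflow_builtin` and `count_mismatches` are hypothetical names, and the builtin is the GCC/Clang one rather than C23's `ckd_add`.

```c
#include <stdbool.h>
#include <stdint.h>

/* The bit trick from the comment: overflow happened when both inputs
 * have a sign different from the sign of the wrapped sum. */
bool add_will_overflow(int32_t a, int32_t b) {
    uint32_t c = (uint32_t)a + (uint32_t)b;
    return (((uint32_t)a ^ c) & ((uint32_t)b ^ c)) >> 31;
}

/* The same question answered by the GCC/Clang builtin. */
bool add_will_overflow_builtin(int32_t a, int32_t b) {
    int32_t unused;
    return __builtin_add_overflow(a, b, &unused);
}

/* Counts disagreements between the two over a few edge values. */
int count_mismatches(void) {
    static const int32_t samples[] = {
        INT32_MIN, INT32_MIN + 1, -1, 0, 1, INT32_MAX - 1, INT32_MAX
    };
    const int n = (int)(sizeof samples / sizeof samples[0]);
    int mismatches = 0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            if (add_will_overflow(samples[i], samples[j]) !=
                add_will_overflow_builtin(samples[i], samples[j])) {
                mismatches++;
            }
        }
    }
    return mismatches;
}
```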
| jcranmer wrote:
| Checked overflow operations are kind of the go-to example for
| "it's easy in assembly, hard in programming languages"--in
| hardware terms, it's usually just checking a flag, but since flag
| registers are not provided for in high-level languages, it
| becomes a game of try to write it in a pattern that the
| compiler can recognize, which is never a fun game to play. Even
| worse than addition is multiplication. Thankfully, C23 has
| _finally_ added these operations.
|
| Although, recently, I ran into a case where I wanted
| checked (u32 - u32) -> i32 and (u32 + i32) -> u32 operations,
| which even Rust's standard library doesn't provide. (The use
| case is keeping track of a running delta between two lists of
| u32 values--the delta can go positive or negative, so it has to
| be signed, but the values in the lists can never be negative).
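Because the checked operations are type-generic and behave as if computed on mathematical integers, the mixed-type cases described above can be sketched directly in C. `delta_u32` and `apply_delta_u32` are hypothetical names, and the GCC/Clang builtins are used here; the assumption is that C23's `ckd_sub` and `ckd_add` accept the same mix of operand and result types.

```c
#include <stdbool.h>
#include <stdint.h>

/* (u32 - u32) -> i32: returns true when the exact difference
 * does not fit in an int32_t. */
bool delta_u32(uint32_t a, uint32_t b, int32_t *delta) {
    return __builtin_sub_overflow(a, b, delta);
}

/* (u32 + i32) -> u32: returns true when the exact sum is negative
 * or larger than UINT32_MAX. */
bool apply_delta_u32(uint32_t base, int32_t delta, uint32_t *out) {
    return __builtin_add_overflow(base, delta, out);
}
```

The operand types and the result type are independent, so no manual widening to a 64-bit intermediate is needed.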
| raphlinus wrote:
| The addition operator just landed in Rust 1.66
| (checked_add_signed[1]), but for the subtraction one it looks
| like you'd need to roll your own.
|
| [1]: https://github.com/rust-lang/rust/issues/87840
| TazeTSchnitzel wrote:
| > it becomes a game of try to write it in a pattern that the
| compiler can recognize
|
| Worse, in C or C++ you also need to find a way to do it
| without undefined behaviour. You can't just do the operation
| and see if the result matches expectations...
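The point about undefined behaviour is why a pattern such as `int c = a + b; if (c < a) ...` is not a valid check for signed `int`: the overflow has already happened by the time the comparison runs. A portable pre-C23 alternative compares against the limits before adding. The sketch below is one common way to write it; `add_would_overflow_portable` is a hypothetical name.

```c
#include <limits.h>
#include <stdbool.h>

/* Decides whether a + b would overflow without ever computing a + b,
 * so no signed overflow (undefined behaviour) can occur. */
bool add_would_overflow_portable(int a, int b) {
    if (b > 0) {
        return a > INT_MAX - b;   /* INT_MAX - b cannot overflow for b > 0 */
    }
    if (b < 0) {
        return a < INT_MIN - b;   /* INT_MIN - b cannot overflow for b < 0 */
    }
    return false;                 /* adding zero never overflows */
}
```

It works, but it needs a separate derivation for each operation, and multiplication is considerably messier, which is the ergonomic gap the checked operations close.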
| dezgeg wrote:
| Hmm, isn't the Apple-specific magic only for parity(PF) and aux
| carry (AF)? aarch64 does have a 'V' flag for signed overflow.
| raphlinus wrote:
| Oops, you're right. Too late to edit, sorry about the
| confusion.
| dooglius wrote:
| This is already present as a builtin in GNU C (as indicated
| in TFA) and it already results in the optimal code:
| https://godbolt.org/z/qc4zvav7E
| tinglymintyfrsh wrote:
| Also, Hacker's Delight and OpenBSD probably have clever
| solutions for these.
| unwind wrote:
| Really fun, great introduction to the exciting new features.
|
| Part of the example is implemented using `nullptr` instead of
| `NULL`; is that also from the future, though?
| gustedt wrote:
| Yes, `nullptr` will also be in C23.
| pwdisswordfish9 wrote:
| What for, if ((void *)0) is sufficient in C?
| saagarjha wrote:
| Apparently implementations don't use this or something like
| that.
| RcouF1uZ4gsC wrote:
| One major concern with this type of safety function is that you
| have to explicitly opt in. If you are actually thinking about the
| need to call these functions, you are already thinking about
| overflow.
| jdhdjdbdjdbd wrote:
| And that's why you use C in the first place?
| jacquesm wrote:
| > If you are actually thinking about the need to call these
| functions, you are already thinking about overflow.
|
| As you should, for any datatype.
| ash_gti wrote:
| I know most of these are compiler intrinsics but it's good to
| have them standardized.
___________________________________________________________________
(page generated 2022-12-19 23:01 UTC)