[HN Gopher] (Don't) crank up the warnings to 11
___________________________________________________________________
(Don't) crank up the warnings to 11
Author : jjgreen
Score : 47 points
Date : 2023-03-15 21:48 UTC (1 hours ago)
(HTM) web link (lemire.me)
(TXT) w3m dump (lemire.me)
| ok123456 wrote:
| Compiler generated warnings are generally pretty high quality.
| Dynamic analysis tools generate high quality warnings. Static
| analysis tools that work on a significant amount of the AST seem
| to generally give good enough warnings.
|
| It's the shallow pattern based, heuristic tools seem to generate
| a lot of noise.
| teddyh wrote:
| > _However, I fear that it is part of a larger trend where people
| come to rely more or more on overbearing static analysis to judge
| code quality. The more warnings, the better, they think._
|
| Don't get me started on security scanners which basically report
| all "open" ports as a security issue.
| 2h wrote:
| Or, maybe stop doing pointer arithmetic? We have had some pretty
| good new languages appear in the last 7 years. I know people are
| averse to change, me included. But you may want to ask yourself
| if continuing to use a 50 year old language is the right answer.
| Snetry wrote:
| Oh how correct you are.
|
| Let us throw away all the existing code written in languages
| made in a different age and use languages that are completely
| binary incompatible just because we can.
| eklitzke wrote:
| This example isn't doing pointer arithmetic.
| jeffbee wrote:
| Yes, it is. const __m512i *ptr1 = (const
| __m512i*)container1->words;
|
| Then later: __m512i r1 =
| _mm512_loadu_si512(ptr1+i);
| zaptheimpaler wrote:
| This is low level code written in a library to make full use of
| AVX-512.. which is inherently a processor native feature that
| has to be written in a language close to assembly because the
| processor only understands machine code.. there is no way to
| avoid assembly or the pointer arithmetic. Someone has to do the
| pointer arithmetic so the higher level languages and libraries
| don't have to.
| loup-vaillant wrote:
| Well, then the proper response from Github would be something
| like: WARNING: HIGH-TO-CRITICAL SEVERITY:
| You are using C. C is not memory safe, and likely to
| expose your users to critical vulnerabilities.
|
| Not sure it would fly.
| [deleted]
| ChrisMarshallNY wrote:
| _> Training young programmers to avoid non-issues may make them
| less productive._
|
| I've found that training _good habits_ , pays off in spades.
|
| Once a technique becomes _habit_ , its overhead basically drops
| to 0.
|
| May take bit to get there, but once it's ingrained, it is cost-
| free.
|
| An example is when I trained myself to switch from Whitesmiths
| indenting, to K&R indenting (when I switched from C++, to Swift).
|
| It took a few weeks of discomfort, but it is totally ingrained,
| these days.
| adastra22 wrote:
| I have turned off all GitHub scanner notifications. There is just
| too much noise and my time is limited.
| bragr wrote:
| I don't know anything about the code quality of the author, but
| in my experience, the ones that complain about static checking
| and other guard rails the most, are the ones that need it the
| most, because in their hubris of thinking they won't make those
| kinds of errors, they make the most egregious forms of those
| errors. e.g. smart enough take pot shots about the possible
| downsides of the process, but not smart enough (or self aware
| enough) to see their own fallibility.
| jjgreen wrote:
| There's a link to Professor Lemire's bio on the linked article:
| https://lemire.me/en/
| [deleted]
| [deleted]
| Someone wrote:
| * * *
| stusmall wrote:
| I remember an argument with an engineer about static code
| analysis years ago. Without a trace of irony, he honestly said
| "I know it's high quality code. I wrote it. What could a
| program tell me about it that I don't already know?" You can
| guess how much of a nightmare his code was to work with.
|
| This isn't a comment on the original article or the author at
| all. Just my experience with a certain type of engineer.
| omoikane wrote:
| > Modifying code to fix a non-issue has a non-zero chance of
| introducing a real bug
|
| One example of such bug, where fixing a purify warning about
| uninitialized memory crippled OpenSSL:
|
| https://www.schneier.com/blog/archives/2008/05/random_number...
|
| via: https://xkcd.com/424/
|
| I think it's generally preferred to enable most warnings but have
| some annotations to silence false positives, preferably with
| sufficient comments accompanying those annotations to help the
| code reviewers.
| jeffbee wrote:
| The PR in question seems to be
| https://github.com/RoaringBitmap/CRoaring/pull/446
|
| I agree that the GitHub analyses are worse than useless, but I
| also think it highlights the friction between intrinsics and the
| program in which they appear. There probably is a better model.
| zX41ZdbW wrote:
| The order should be as follows:
|
| - automated testing;
|
| - -Wall -Wextra, and -Weverything with some exceptions;
|
| - address, memory, thread, and undefined sanitizers;
|
| - fuzzing (at least some sort of), coverage-guided fuzzing with
| sanitizers;
|
| - static analysis, starting with clang-tidy;
|
| - other static analyzers, such as Coverity, CodeQL - have the
| least priority.
|
| An example of how everything of the above, everywhere, all at
| once is applied in public can be seen in the ClickHouse
| repository: https://github.com/ClickHouse/ClickHouse in the per-
| commit checks. But keep in mind that the checks are really heavy
| - run for multiple hours, and we have to apply kludges to support
| GitHub Actions with spot instances...
|
| Also the dashboard can be checked here:
| https://aretestsgreenyet.com/
| wahern wrote:
| C and C++ developers who have been around long enough
| understand why clang _specifically_ says _not_ to regularly use
| -Weverything.
|
| Historically even -Wextra has resulted in spurious warnings for
| perfectly valid and clean code, as most if not all -Wextra
| warnings are _heuristics_. -Weverything is also all heuristics,
| but even more volatile and debatable. Any worthwhile
| -Weverything heuristic will eventually move to -Wextra.
|
| So, sure, use -Weverything on occasion to see if there's
| anything interesting to take note of. But if you include it by
| default you're inviting fatigue and unnecessary rewrites, which
| may be just as likely to result in the introduction of bugs as
| to fix any bugs.
| tmtvl wrote:
| I like adding -pedantic to -Wall and -Wextra, although getting
| other people's software to compile with that is a nightmare.
| aidenn0 wrote:
| Right thinking people can also disagree on what is a false-
| positive. I remember a dialog between two people about one such
| thing. The summary is:
|
| 1. No improper behavior occurs currently because of the item
| flagged
|
| 2. However, reasonable changes to other parts of the code _could_
| cause the problem to occur
|
| 3. There is an obvious way of rewriting that avoids #2.
|
| One person argued it was a false positive because of #1, while
| the other argued it was a true positive because #2 and #3
| combined means you definitely want to change the code...
| jgrahamc wrote:
| I prefer maximum warnings and use of #pragma to disable a warning
| at a specific point when I know the warning is useless/wrong.
| That way I can get to a totally clean build with zero warnings
| output. The last significant piece of C I wrote:
| # Turn on absolutely all warnings and turn them into errors
| CFLAGS += -g -Wall -Wextra -Wno-unused-parameter -Werror
|
| https://github.com/cloudflare/keyless/blob/master/Makefile#L...
|
| The reason I do this is if I leave some warnings in I can get
| complacent "oh, there's always a few warnings" or "that's
| expected" and then miss something. So, I like all the warnings
| and warnings treated as errors.
| Snetry wrote:
| well, clearly not "maximum warnings" since you are missing a
| lot of them. -pedantic -Wunreachable-code -Wcast-qual -Wswitch-
| default -Wconversion -Wshadow -Wstrict-aliasing -Winit-self
| -Wcast-align -Wpointer-arith -Wmissing-declarations -Wmissing-
| include-dirs -Wno-unused-parameter -Wuninitialized and thats
| not even all that are useful.
|
| Also, don't use -Werror if you can Warnings will change based
| on compiler, compiler version and a countless number of other
| factors that make working with code that turns them into errors
| a pain.
| loup-vaillant wrote:
| Man, you should try Clang's `-Weverything` it's... well you
| know what, I'll try that on my stupidly high-quality
| cryptographic code: $ make "CC=clang -std=c99"
| "CFLAGS=-O3 -Weverything" clang -std=c99 -O3 -Weverything
| -I src -I src/optional -fPIC -c -o lib/monocypher.o
| src/monocypher.c src/monocypher.c:262:6: warning: mixing
| declarations and code is incompatible with standards before C99
| [-Wdeclaration-after-statement] u8
| tmp[64]; ^
| src/monocypher.c:231:9: warning: mixing declarations and code
| is incompatible with standards before C99 [-Wdeclaration-after-
| statement] u32 pool[16];
| ^ src/monocypher.c:337:12: warning: mixing declarations
| and code is incompatible with standards before C99
| [-Wdeclaration-after-statement] const u64 s0 =
| ctx->h[0] + (u64)s[0]; // s0 <= 1_fffffffe
| ^ In file included from src/monocypher.c:54:
| src/monocypher.h:289:9: warning: padding size of
| 'crypto_poly1305_ctx' with 4 bytes to alignment boundary
| [-Wpadded] typedef struct { ^
| src/monocypher.c:410:9: warning: mixing declarations and code
| is incompatible with standards before C99 [-Wdeclaration-after-
| statement] size_t nb_blocks = message_size >> 4;
| ^ src/monocypher.c:437:6: warning: mixing declarations
| and code is incompatible with standards before C99
| [-Wdeclaration-after-statement] u64 c = 5;
| ^ src/monocypher.c:498:6: warning: mixing declarations
| and code is incompatible with standards before C99
| [-Wdeclaration-after-statement] u64 v0 =
| ctx->hash[0]; u64 v8 = iv[0]; ^
| src/monocypher.c:621:10: warning: mixing declarations and code
| is incompatible with standards before C99 [-Wdeclaration-after-
| statement] size_t nb_words = message_size
| >> 3; ^
| src/monocypher.c:600:9: warning: mixing declarations and code
| is incompatible with standards before C99 [-Wdeclaration-after-
| statement] size_t nb_blocks = message_size >> 7;
| ^ src/monocypher.c:640:9: warning: mixing declarations
| and code is incompatible with standards before C99
| [-Wdeclaration-after-statement] size_t hash_size
| = MIN(ctx->hash_size, 64); ^
| src/monocypher.c:786:6: warning: mixing declarations and code
| is incompatible with standards before C99 [-Wdeclaration-after-
| statement] u8 hash_area[1024];
| ^ src/monocypher.c:874:10: warning: mixing declarations
| and code is incompatible with standards before C99
| [-Wdeclaration-after-statement]
| u32 next_slice = ((slice + 1) % 4) * segment_size;
| ^ src/monocypher.c:801:6: warning: mixing declarations
| and code is incompatible with standards before C99
| [-Wdeclaration-after-statement] int constant_time
| = config.algorithm != CRYPTO_ARGON2_D; ^
| src/monocypher.c:1170:6: warning: mixing declarations and code
| is incompatible with standards before C99 [-Wdeclaration-after-
| statement] i32 q = (19 * t[9] + (((i32) 1) <<
| 24)) >> 25; ^ src/monocypher.c:1337:5:
| warning: mixing declarations and code is incompatible with
| standards before C99 [-Wdeclaration-after-statement]
| u8 isodd = s[0] & 1; ^
| src/monocypher.c:1349:6: warning: mixing declarations and code
| is incompatible with standards before C99 [-Wdeclaration-after-
| statement] int isdifferent = crypto_verify32(fs,
| gs); ^ src/monocypher.c:1433:7:
| warning: mixing declarations and code is incompatible with
| standards before C99 [-Wdeclaration-after-statement]
| i32 *quartic = t1; ^
| src/monocypher.c:1500:5: warning: mixing declarations and code
| is incompatible with standards before C99 [-Wdeclaration-after-
| statement] fe x2, z2, x3, z3, t0, t1;
| ^ src/monocypher.c:1650:6: warning: mixing declarations
| and code is incompatible with standards before C99
| [-Wdeclaration-after-statement] u64 carry = 1;
| ^ src/monocypher.c:1676:6: warning: mixing declarations
| and code is incompatible with standards before C99
| [-Wdeclaration-after-statement] u32 B[8];
| load32_le_buf(B, b, 8); ^
| src/monocypher.c:1756:6: warning: mixing declarations and code
| is incompatible with standards before C99 [-Wdeclaration-after-
| statement] int is_square = invsqrt(h->X, h->X);
| ^ src/monocypher.c:1956:8: warning: mixing declarations
| and code is incompatible with standards before C99
| [-Wdeclaration-after-statement]
| int lsb = v & (~v + 1); // smallest bit of v
| ^ src/monocypher.c:2015:7: warning: mixing declarations
| and code is incompatible with standards before C99
| [-Wdeclaration-after-statement] int
| h_digit = slide_step(&h_slide, P_W_WIDTH, i, h);
| ^ src/monocypher.c:1994:12: warning: mixing declarations
| and code is incompatible with standards before C99
| [-Wdeclaration-after-statement] ge_cached
| lutA[P_W_SIZE]; ^
| src/monocypher.c:2185:5: warning: mixing declarations and code
| is incompatible with standards before C99 [-Wdeclaration-after-
| statement] fe tmp_a, tmp_b; // temporaries for
| addition ^ src/monocypher.c:2540:5:
| warning: mixing declarations and code is incompatible with
| standards before C99 [-Wdeclaration-after-statement]
| fe t1, t2; ^ src/monocypher.c:2641:6:
| warning: mixing declarations and code is incompatible with
| standards before C99 [-Wdeclaration-after-statement]
| int is_square = invsqrt(t1, t1); ^
| src/monocypher.c:2696:6: warning: mixing declarations and code
| is incompatible with standards before C99 [-Wdeclaration-after-
| statement] int is_square = invsqrt(t3, t3); // t3
| = sqrt(-1 / non_square * u * (u+A)) ^
| src/monocypher.c:2775:6: warning: mixing declarations and code
| is incompatible with standards before C99 [-Wdeclaration-after-
| statement] u32 t[16] = {0}; ^
| src/monocypher.c:2815:6: warning: mixing declarations and code
| is incompatible with standards before C99 [-Wdeclaration-after-
| statement] u32 m_scl[8]; ^
| src/monocypher.c:2870:22: warning: mixing declarations and code
| is incompatible with standards before C99 [-Wdeclaration-after-
| statement] crypto_poly1305_ctx poly_ctx;
| // auto wiped... ^
| src/monocypher.c:2925:6: warning: mixing declarations and code
| is incompatible with standards before C99 [-Wdeclaration-after-
| statement] int mismatch = crypto_verify16(mac,
| real_mac); ^ src/monocypher.c:2953:6:
| warning: mixing declarations and code is incompatible with
| standards before C99 [-Wdeclaration-after-statement]
| int mismatch = crypto_aead_read(&ctx, plain_text, mac, ad,
| ad_size, ^ 33 warnings generated.
| clang -std=c99 -O3 -Weverything -I src -I src/optional -fPIC -c
| -o lib/monocypher-ed25519.o src/optional/monocypher-ed25519.c
| src/optional/monocypher-ed25519.c:165:8: warning: mixing
| declarations and code is incompatible with standards before C99
| [-Wdeclaration-after-statement]
| u64 in = K[i16 + j] + ctx->input[j];
| ^ src/optional/monocypher-ed25519.c:158:9: warning:
| mixing declarations and code is incompatible with standards
| before C99 [-Wdeclaration-after-statement] size_t
| i16 = 0; ^ 2 warnings generated.
| ar cr lib/libmonocypher.a lib/monocypher.o lib/monocypher-
| ed25519.o clang -std=c99 -O3 -Weverything -shared
| -Wl,-soname,libmonocypher.so.4 -o lib/libmonocypher.so.4
| lib/monocypher.o lib/monocypher-ed25519.o ln -sf
| `basename lib/libmonocypher.so.4` lib/libmonocypher.so
| doc/doc_gen.sh
|
| ---
|
| Well...
|
| Yeah.
|
| _That_ is cranking up warnings up to 11. Now this is C99 we
| 're talking about, how about removing that obviously useless
| `-Wdeclaration-after-statement`? $ make
| "CC=clang -std=c99" "CFLAGS=-O3 -Weverything -Wno-declaration-
| after-statement" clang -std=c99 -O3 -Weverything -Wno-
| declaration-after-statement -I src -I src/optional -fPIC -c -o
| lib/monocypher.o src/monocypher.c In file included from
| src/monocypher.c:54: src/monocypher.h:289:9: warning:
| padding size of 'crypto_poly1305_ctx' with 4 bytes to alignment
| boundary [-Wpadded] typedef struct { ^
| 1 warning generated.
|
| Ah, this one's may be real. Let's see this struct:
| typedef struct { // Do not rely on the size or contents
| of this type, // for they may change without notice.
| uint8_t c[16]; // chunk of the message size_t c_idx;
| // How many bytes are there in the chunk. uint32_t r
| [4]; // constant multiplier (from the secret key)
| uint32_t pad[4]; // random number added at the end (from the
| secret key) uint32_t h [5]; // accumulated hash }
| crypto_poly1305_ctx;
|
| Ah, I see: we end by an odd number of 32-bit words, and in the
| 64-bit architecture I'm compiling this on, there will be 4
| bytes of padding. But then I have two problems: first, who
| cares? Second, I also target 32-bit architectures on which
| there will _not_ be any padding, and adding it manually would
| be wasteful.
|
| There's the `#pragma` to silence the warning of course, but
| it's ugly, and I'm not sure it is strictly portable (unless the
| standard says compilers should ignore the `#pragma` if they
| don't understand it, but I confess haven't checked).
| remexre wrote:
| Iunno, for this example, isn't this actually UB according to
| standard C? I don't think it'd be that objectionable to put e.g.
| a semantic comment saying "yeah no this is actually not UB
| because we check in ./configure that your compiler is one known
| to not do bad things," * _if*_ such a thing were standardized
| across linting tools.
|
| Of course, that's a heckuva if.
| david2ndaccount wrote:
| This is not standard C, this is already in the land of compiler
| extensions and platform specific code. The behavior is defined
| by the compiler and no compiler should treat this as UB.
| remexre wrote:
| Is the code actually compiled with -fno-strict-aliasing
| (which IME is rare for userspace code, but I don't work with
| numerical code very much), or are you saying this is because
| of the types involved in the cast? I didn't see anything in
| [0] about a change in pointer semantics from using the vector
| extensions in the compilation unit?
|
| [0]: https://gcc.gnu.org/onlinedocs/gcc/Vector-
| Extensions.html#Ve...
| deathanatos wrote:
| It is using compiler intrinsics, I grant, and taking
| advantage of IB is yes, well-defined given an implementation
| (as we have here) ... but it seems like it's still C?
|
| Here, I'm not 100% certain this code isn't UB: we type-pun a
| __m512i and a uint64_t ... I'm not certain that's not UB. If
| I'm forced into writing C, I try to avoid doing that at all
| costs ... I'd rather not encounter nasal demons.
|
| The warning also seems a bit merited, given that each
| "increment" is over more than a single object on the
| underlying array. (But, of course, that is the intent. But
| I'm just not sure the static analyzer can see it.)
| cpeterso wrote:
| Plus, a pointer to __m512i might require different
| alignment than a pointer to unsigned long.
|
| This code received a pointer to unsigned long from
| somewhere that might have cared about the data it pointed
| to, so his code should acknowledge that (with type casts or
| assertions) or make a __m512i or void* pointer part of this
| function's signature, making the type change part of the
| contract with the caller.
| eklitzke wrote:
| This is not UB.
|
| In this case where you're using the wrong pointer type the
| worst thing that can happen is you can have aliasing issues.
| For example, if I have an int* and a double*, the C compiler is
| free to assume that the pointers can never alias each other so
| it can optimize loads and stores assuming that an update to the
| int pointer will never affect the double pointer. In general
| this can cause problems if the pointers actually do alias each
| other, but if there's no aliasing occurring then nothing bad
| will happen.
| remexre wrote:
| My understanding was that the UB was defined such that:
| int i = 0; float f = 1.0f; void* p;
| if(rand() & 1) p = &i; else p =
| &f; int j = *(int*)p; assert(i ==
| j);
|
| could legally be compiled to a no-op (or a single call to
| rand() with an ignored result, I suppose), despite there
| being no "weird" pointer stores, since the ill-typed-at-
| runtime load "can't happen," so the compiler can reason that
| (int*)p must equal &i, and hence j = i.
|
| EDIT: And, in fact, gcc does this:
| https://godbolt.org/z/WM33vzzdo
|
| Not sure if that applies to vector loads, and the gcc docs
| don't specify, though.
___________________________________________________________________
(page generated 2023-03-15 23:00 UTC)