[HN Gopher] Zero or Sign Extend
       ___________________________________________________________________
        
       Zero or Sign Extend
        
       Author : todsacerdoti
       Score  : 101 points
       Date   : 2024-10-24 00:48 UTC (22 hours ago)
        
 (HTM) web link (fgiesen.wordpress.com)
 (TXT) w3m dump (fgiesen.wordpress.com)
        
       | eqvinox wrote:
       | > ... this explicitly relies on shifting something into the sign
       | bit, which depending on the exact flavor of language standard
       | you're using is either not allowed or at best fairly recently ..
       | 
       | An unsigned has no sign bit, so the left shift just needs to be
       | unsigned to make it "technically correct".
       | 
       | (Remember to not use smaller than int types though, due to
       | integer promotion issues)
        
         | fluoridation wrote:
         | Yup. When you're twiddling bits you're better off using
         | unsigned types in general anyway, and leaving converting to a
         | signed type at the very end.
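A minimal sketch of the approach described in the two comments above: do the left shift on an unsigned value and convert to a signed type only at the end. The function name `sign_extend_shift` and its parameters are hypothetical, and the sketch assumes `int` is 32 bits wide, so `uint32_t` is not promoted to a signed `int`. A `uint16_t` operand would be promoted to `int` before the shift, which is the promotion pitfall mentioned above.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend the low `bits` bits of x (1 <= bits <= 32).
   Assumes int is 32 bits wide, so uint32_t is not promoted to signed int. */
static int32_t sign_extend_shift(uint32_t x, unsigned bits)
{
    unsigned shift = 32u - bits;

    /* Unsigned left shift: bits that fall off the top are discarded,
       and nothing is ever shifted "into a sign bit". */
    uint32_t top = x << shift;

    /* Converting an out-of-range value to int32_t and right-shifting a
       negative value are implementation-defined before C23; mainstream
       compilers wrap and shift arithmetically. */
    return (int32_t)top >> shift;
}
```

For an 11-bit field, `sign_extend_shift(0x7FFu, 11)` yields -1 on compilers that behave as the comment assumes.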
        
       | jchw wrote:
       | Of course doing the undefined thing works on almost any platform
       | except DS9k, but that last formulation is quite elegant. It's a
       | bit like byteswapping in that it's fairly simple to do but it's
       | even simpler to _not_ do by just never relying on the machine
       | endianness.
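To illustrate the byteswapping analogy above, here is a minimal sketch of reading a little-endian 32-bit value without ever consulting the machine's endianness. The function name `load_le32` is hypothetical; the value is assembled from individual bytes, so the same code gives the same result on little- and big-endian hosts.

```c
#include <stdint.h>

/* Hypothetical helper: read a 32-bit little-endian value from a byte buffer.
   Each byte is widened to uint32_t before shifting, so no shift happens
   on a promoted signed int. */
static uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}
```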
        
         | Sesse__ wrote:
         | Also shifts, especially variable-length shifts, are frequently
         | slower than xor and add/sub (e.g., on x86, a variable-count shl
         | only takes its count in cl, and shlx has high latency), so
         | that's another score for the xor variant.
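For reference, a minimal sketch of what an xor/sub sign extension can look like in C. The name `sign_extend_xor_sub` is hypothetical, and this is one common formulation rather than necessarily the exact one in the linked article. It flips the field's sign bit with xor and then subtracts that bit back out, all in unsigned arithmetic.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend the low `bits` bits of x (1 <= bits <= 32)
   using xor and subtract instead of a shift pair. */
static int32_t sign_extend_xor_sub(uint32_t x, unsigned bits)
{
    uint32_t m = (uint32_t)1 << (bits - 1);   /* sign bit of the field */
    uint32_t field = x & ((m << 1) - 1);      /* keep only the low `bits` bits */

    /* If the sign bit was set, xor clears it and the subtraction borrows,
       filling the upper bits with ones; otherwise xor sets it and the
       subtraction removes it again. Unsigned wraparound is well-defined. */
    uint32_t r = (field ^ m) - m;

    /* The final conversion is implementation-defined before C23 for values
       above INT32_MAX; mainstream compilers wrap. */
    return (int32_t)r;
}
```

When `bits` is a compile-time constant, `m` and the mask become immediates, which is the case the code-size discussion below refers to.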
        
           | CalChris wrote:
           | Maybe for _variable_ variable-length shifts but for
           | _constant_ variable-length shifts, _SHL reg, imm8_ is single
           | cycle on recent x86_64 microarchitectures.
        
             | Sesse__ wrote:
             | But xor and sub can go in way more ports, giving you higher
             | throughput.
        
               | dzaima wrote:
               | "way more" is 2 vs 4 ports (5 on Alder Lake and newer);
               | 1/cycle via shifts is probably good enough for most
               | use-cases (though the more concentrated port pressure
               | could be an issue in a larger context).
               | 
               | And with hard-coded immediates, xor+sub also ends up at
               | twice the code size of shl+shr, so there's some
               | trade-off (but yeah, if code size isn't a concern,
               | xor+sub wins out).
        
       | Neywiny wrote:
       | This is the perfect spot to use a bitfield. You can declare it
       | signed or unsigned, and the compiler will deal with it all and
       | optimize. No bit ops to get wrong or maintain. Very readable and
       | scalable.
        
         | edflsafoiewq wrote:
         | But the width and signedness of a bitfield are defined at
         | compile-time, while in this example they need to come from a
         | format read at runtime.
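A minimal sketch of the bit-field idea, for the case where the width is fixed at compile time (11 bits here, chosen only for illustration). The names `field11` and `sign_extend_bitfield11` are hypothetical. Storing a value that does not fit in the signed field is implementation-defined in C before C23; GCC and Clang wrap it, which is the behavior this sketch relies on.

```c
#include <stdint.h>

/* Hypothetical 11-bit signed field; the width must be a compile-time
   constant, so it cannot come from a format read at runtime. */
struct field11 {
    signed int v : 11;
};

static int32_t sign_extend_bitfield11(uint32_t x)
{
    struct field11 f;

    /* Store the low 11 bits; values above 1023 do not fit and are
       converted in an implementation-defined way (wrapping on GCC/Clang). */
    f.v = (int)(x & 0x7FFu);

    /* Reading the field back yields the sign-extended value. */
    return f.v;
}
```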
        
           | cryptonector wrote:
           | So? The author knows a priori the size of the int on the
           | wire.
        
         | almostgotcaught wrote:
         | > the compiler
         | 
         | I love when people say this as if there's exactly one compiler
         | with a fixed implementation for whatever opt pass.
        
           | AlotOfReading wrote:
           | That's not how this phrase is used. It usually encompasses
           | any reasonably advanced compiler like clang, GCC, and
           | sometimes MSVC.
        
             | Joker_vD wrote:
             | But not including any of the slightly broken C compilers
             | that the embedded hardware manufacturers provide (nor ICC,
             | either)?
        
               | AlotOfReading wrote:
               | I'm just providing examples, not excluding everything
               | unmentioned.
        
               | adgjlsfhk1 wrote:
               | as of a few years ago, ICC is just LLVM with some tweaked
               | settings
        
         | monocasa wrote:
         | Endianness of bit fields changes with arch, i.e., is the first
         | bit-field member the most or least significant bit range of
         | the associated word?
        
           | cryptonector wrote:
           | Yes, first you have to swab, if you have to swab.
        
         | epcoa wrote:
         | Not in C or C++, at least, the bit and byte order is not
         | defined.
        
         | gpderetta wrote:
         | At least GCC was very conservative in dealing with bitfields
         | and, last time I bothered to check, generated suboptimal code.
        
       | IshKebab wrote:
       | Eh the author's suggestions only seem better because C++ is
       | insane.
       | 
       | The last one is definitely nice though!
        
         | vlovich123 wrote:
         | Can you post examples in other languages where this would be
         | easier?
        
           | IshKebab wrote:
           | Sure, in Rust:
           | 
           |     fn sign_extend_u11(x: u32) -> u32 {
           |         (((x as i32) << (32-11)) >> (32-11)) as u32
           |     }
           | 
           | Doesn't have any of the C++ issues he mentions. And it will
           | be faster than the alternative since it's just two
           | instructions. (Ok this is never going to matter in practice
           | but still...)
        
       ___________________________________________________________________
       (page generated 2024-10-24 23:01 UTC)