[HN Gopher] The Byte Order Fiasco
___________________________________________________________________
The Byte Order Fiasco
Author : cassepipe
Score : 269 points
Date : 2021-05-08 11:00 UTC (12 hours ago)
(HTM) web link (justine.lol)
(TXT) w3m dump (justine.lol)
| londons_explore wrote:
| Remember how we used to have machines with a 7 bit byte? And
| everything was written to handle either 6, 7, or 8 bit bytes?
|
| And now we've settled on all machines being 8 bit bytes, and
| programmers no longer have to worry about such details?
|
| Is it time to do the same for big endian machines? Is it time to
| accept that all machines that matter are little endian, and the
| extra effort keeping everything portable to big endian is no
| longer worth the mental effort?
| bumbada wrote:
| What happens is that all machines that matter are little
| endian, but the network always works in big endian.
| londons_explore wrote:
| We'll have to keep it as a quirk of history...
|
| A bit like the electron has a negative charge...
| PixelOfDeath wrote:
| They had a 50/50 chance of getting the technical
| direction of current flow right... and they fucked it up!
| bombcar wrote:
| Isn't big endian a bit more natural considered on a bit
| level? The bits start from highest to lowest on a serial
| connection.
| ben509 wrote:
| Big-endian is natural when you're comparing numbers, which
| is probably why people represent numbers in a big-endian
| fashion.
|
| Little-endian is natural with casts because the address
| doesn't change, and it's the order in which addition takes
| place.
| kangalioo wrote:
| I feel like big endian is more _intuitive_ because that's
| what our number notation has evolved to be.
|
| But more _natural_ is little endian because, well, it's
| just more straightforward to have the digits' magnitude
| be in ascending order (2^0, 2^1, 2^2, 2^3...) instead of
| putting it in reverse.
|
| Plus you encounter fewer roadblocks in practice with
| little endian (e.g. the address doesn't change with
| casts), which is often a sign of good natural design.
| CydeWeys wrote:
| I'm curious how you're defining "natural", and if you
| think ISO-8601 is the reverse of "natural" too.
|
| All human number systems I've ever seen write numbers out
| as big-endian (yes, even Roman numerals), so I'm really
| struggling to see how that wouldn't be considered
| natural.
| ByteJockey wrote:
| It seems like it would be more natural for representing
| the number when communicating with a human.
|
| But that's not what we're doing here, so it's not
| entirely relevant.
| pabs3 wrote:
| IBM is going to be pretty annoyed when your code doesn't work
| on their mainframes.
| jart wrote:
| In my experience IBM does the right thing and sends patches
| rather than asking us to fix their problems for them, and I
| respect them for that reason, even if it's a tiny burden to
| review those changes.
|
| However endianness isn't just about supporting IBM. Modern
| compilers will literally break your code if you alias memory
| using a type wider than char. It's illegal per the standard.
| In the past compilers would simply not care and say, oh the
| architecture permits unaligned reads so we'll just let you do
| that. Not anymore. Modern GCC and Clang force your code to
| conform to the abstract standard definition rather than the
| local architecture definition.
|
| It's also worth noting that people think x86 architecture
| permits unaligned reads but that's not entirely true. For
| example, you can't do unaligned read-ahead on C strings,
| because in extremely rare cases you might cross a page
| boundary that isn't defined and trigger a segfault.
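|
| A minimal sketch of the difference (not from the article;
| compilers reliably turn the memcpy into a single load on
| targets that permit it):
|
|       #include <stdint.h>
|       #include <string.h>
|
|       unsigned char b[8];
|
|       uint32_t bad(void) {
|           return *(uint32_t *)(b + 1); /* UB: aliasing through a
|                                           wider type, plus a
|                                           misaligned access */
|       }
|
|       uint32_t good(void) {
|           uint32_t v;
|           memcpy(&v, b + 1, sizeof(v)); /* well defined */
|           return v;
|       }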
| froh wrote:
| yes IBM provided asm for s390 hton ntoh, and "all we had to
| do" for mainframe Linux was patch x86 only packages to use
| hton ntoh when they persisted binary data. for the kernel
| IBM did it on their own, contributing mainline, for
| userland suse did it, grabbing some patches from japanese
| turbolinux, and then red hat grabbed the patches from turbo
| and suse, and together we got them mainline lol. and PPC
| then just piggybacked on top of that effort.
| einpoklum wrote:
| > Is it time to accept that all machines that matter are little
| endian.
|
| Well, no, because it's not the case. SPARC is big-endian, and a
| bunch of IBM processors. ARM processors are mostly bi-endian.
|
| > Is it time to do the same for big endian machines?
|
| No. Not just because of their prevalence, but because there
| isn't a compelling reason why everything should be little-
| endian.
| genmon wrote:
| That reminds me of a project to interface with vending
| machines. (We built a bookshop in a vending machine that would
| tweet whenever it sold an item, with automated stock
| management.)
|
| Vending machines have an internal protocol a little like I2C.
| We created a custom peripheral to bridge the machine to the
| web, based on a Raspberry Pi.
|
| The protocol was defined by Coca Cola Japan in 1975 (in order
| to have optionality in their supply chain). It's still in use
| today. But because it was designed in Japan, with a need for
| wide characters, it assumes 9 bit bytes.
|
| We couldn't find any way to get a Raspberry Pi to speak 9 bit
| bytes. The eventual solution was a custom shield that would
| read the bits, and reserialise to 8 bit bytes for the Pi to
| understand. And vice versa.
|
| 9 bit bytes. I grew up knowing that bytes had variable length,
| but this was the first time I encountered it in the wild. This
| was 2015.
| raverbashing wrote:
| Well you could bit bang and the 9 bits wouldn't be an issue.
| (Even if you had a tiny PIC microcontroller just to do that)
|
| This is best solved as close to the device in question as
| possible, and in the simplest way possible.
| ghoward wrote:
| Sorry, dumb question: what is bit banging?
| gspr wrote:
| The practice of using software to literally toggle (or
| read) individual pins with the correct software-
| controlled timing in order to communicate with some
| hardware.
|
| To transmit a bit pattern 10010010 over a single pin
| channel, for example, you'd literally set the pin high,
| sleep for some predetermined amount of time, set it
| low, sleep, set it low, sleep, set it high, etc.
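|
| A minimal sketch of that in C (gpio_write() and sleep_us()
| are hypothetical platform calls, not a real API):
|
|       void gpio_write(int level);   /* assumed to exist */
|       void sleep_us(int us);        /* assumed to exist */
|
|       /* send one byte, most significant bit first */
|       void bitbang_send(unsigned char byte, int bit_time_us)
|       {
|           for (int i = 7; i >= 0; i--) {
|               gpio_write((byte >> i) & 1); /* drive the pin */
|               sleep_us(bit_time_us);       /* hold one bit period */
|           }
|       }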
| thristian wrote:
| In order to exchange data over a serial connection, the
| ones and zeroes have to be sent with exact timing, so the
| receiver can reliably tell where one bit ends and the
| next begins. Because of this, the hardware that's doing
| the communication can't do anything else at the same
| time. And since the actual mechanics of the process are
| simple and straightforward, most computers with a serial
| connection have special serial-interface hardware (a
| Universal Asynchronous Receiver/Transmitter, or UART) to
| take care of it - the CPU gives the UART some data, then
| returns to more productive pursuits while the UART works
| away.
|
| But sometimes you can't use a UART: maybe you're working
| on a tiny embedded computer without one, or maybe you
| need to speak a weird 9-bit protocol a standard UART
| doesn't understand. In that case, you can make the CPU
| pump the serial line directly. It's inefficient (there's
| probably more interesting work the CPU could be doing)
| and it can be difficult to make the CPU pause for
| _exactly_ the right amount of time (CPUs are normally
| designed to run as fast or as efficiently as possible,
| nothing in between), but it's possible and sometimes
| it's all you've got. That's bit-banging.
| londons_explore wrote:
| Consider being a teacher. That's a good explanation.
| stefan_ wrote:
| The irony is that while a tiny PIC can do bit banging
| easily, the mighty Pi will struggle with it.
| hazeii wrote:
| I'm familiar with both, and have Pi's bit-banging at
| 8MHz. It's not hard-realtime like a PIC though (where
| I've bitbanged a resistor D2A hung off a dsPIC33 to
| 17.734475MHz). It's an improvement over the years, but
| surprisingly little since bit-banging 4MHz Z80's more
| than 4 decades ago, where resolution was 1 T state
| (250ns).
| londons_explore wrote:
| The 9 bit serial OP mentioned likely doesn't have a
| separate clock line, so it is hard realtime and timing
| matters a _lot_, and I doubt the Pi could reliably do
| anything over 1 kbaud with bit banging. You could do
| much better if you didn't run Linux.
| BlueTemplar wrote:
| We really should have moved to 32 bit bytes when moving to 64
| bit words. Would have simplified Unicode considerably.
| wongarsu wrote:
| People were holding off on transitioning because pointers
| use twice as much space in x64. If bytes had quadrupled in
| space with x64 we would still be using 32 bit software
| everywhere
| BlueTemplar wrote:
| Well, obviously it would have delayed the transition.
| However you can only go so far with memory limited to 4 GB.
|
| And do you have examples of still widely used 8-bit sized
| data formats ?
| jart wrote:
| RGB and Y'CbCr
| Narishma wrote:
| You can go very far with just 4GB of memory, especially
| when not using wasteful software.
| owl57 wrote:
| I assume you wrote this comment in UTF-8 over HTTP
| (ASCII-based) and TLS (lots of uint8 fields).
| jart wrote:
| Use Erlang. It has 32-bit char.
| toast0 wrote:
| Not really. Strings are a list of integers [1], integers
| are signed and fill a system word, but there's also 4
| bits of type information. So you can have a 28-bit signed
| integer char on a 32-bit system or a signed 60-bit
| integer.
|
| However, since Unicode is limited to 21 bits by the
| UTF-16 encoding, a Unicode code point will fit in a small
| integer.
|
| [1] unless you use binaries, which is often a better
| choice.
| spacechild1 wrote:
| You know, bytes are not only about text, they are also used
| to represent _binary_ data...
|
| Not to mention that bytes have nothing to do with unicode.
| Unicode codepoints can be encoded in many different ways:
| UTF8, UTF16, UTF32, etc.
| ChrisSD wrote:
| Not really. Unicode is a variable width abstract encoding;
| a single character can be made up of multiple code points.
|
| For Unicode, 32-bit bytes would be an incredibly wasteful
| in-memory encoding.
| BlueTemplar wrote:
| One byte = one "character" makes for much easier
| programming.
|
| Text generally uses a small fraction of memory and
| storage these days.
| cygx wrote:
| Not all user-perceived characters can be represented as a
| single Unicode codepoint. Hence, Unicode text encodings
| (almost[1]) always have to be treated as variable length,
| even UTF-32.
|
| [1] at runtime, you could dynamically assign 'virtual'
| codepoints to grapheme clusters and get a fixed-length
| encoding for strings that way
| jart wrote:
| Even the individual unicode codepoints themselves are
| variable width if we consider that things like cjk and
| emoji take up >1 monospace cells.
| lanstin wrote:
| Every time I see one of these threads, my gratitude to
| only do backend grows. Human behavior is too complex, let
| the webdevs handle UI, and human languages are too
| complex, not sure what speciality handles that. Give me
| out of order packets and parsing code that skips a
| character if the packet length lines up just so any day.
|
| I am thankful that almost all the Unicode text I see is
| rendered properly now, farewell the little boxes. Good
| job lots of people.
| jart wrote:
| I think we really have the iPhone jailbreakers to thank
| for that. U.S. developers were allergic to, almost offended
| by, anything that wasn't ASCII, and then someone released
| an app that unlocked the emoji icons that Apple had
| originally intended only for Japan. Emoji is defined in
| the astral planes so almost nothing at the time was
| capable of understanding them, yet were so irresistible
| that developers worldwide who would otherwise have done
| nothing to address their cultural biases immediately
| fixed everything overnight to have them. So thanks to
| cartoons, we now have a more inclusive world.
| londons_explore wrote:
| I'm pretty sure Unicode was pretty widespread before the
| iphone/emoji popularity.
| cygx wrote:
| There's supporting Unicode, and 'supporting' Unicode. If
| you're only dealing with western languages, it's easy to
| fall into the trap of only 'supporting' Unicode. Proper
| emoji handling will put things like grapheme clusters and
| zero-width joiners on your map.
| kortex wrote:
| > One byte = one "character" makes for much easier
| programming.
|
| Only if you are naively operating in the Anglosphere /
| world where the most complex thing you have to handle is
| larger character sets. In reality, there's ligatures,
| diacritics, combining characters, RTL, nbsp, locales, and
| emoji (with skin tones!). Not to mention legacy encoding.
|
| And no, it does not use a "small fraction of memory and
| storage" in a huge range of applications, to the point
| where some regions have transcoding proxies still.
| AnIdiotOnTheNet wrote:
| This just doesn't seem right. Granted, I don't know much
| about your use case, but Raspberry Pi's are powerful
| computing devices and I find it difficult to believe there
| was no way to handle this without additional hardware.
| DeRock wrote:
| I'm not familiar with the "vending machine" protocol he's
| talking about, but it's entirely reasonable that it has
| certain timing requirements. Usually the way you interface
| with these is by having a dedicated HW block to talk the
| protocol, or by bit banging. The former wouldn't be
| supported on RPi because it's obscure, the latter requires
| tight GPIO timing control that is difficult to guarantee on
| a non-real-time system like the RPi usually runs.
| [deleted]
| DonHopkins wrote:
| We used to have machines with arbitrarily sized bytes, and 36
| bit words!
|
| http://pdp10.nocrew.org/docs/instruction-set/Byte.html
|
| >In the PDP-10 a "byte" is some number of contiguous bits
| within one word. A byte pointer is a quantity (which occupies a
| whole word) which describes the location of a byte. There are
| three parts to the description of a byte: the word (i.e.,
| address) in which the byte occurs, the position of the byte
| within the word, and the length of the byte.
|
| >A byte pointer has the following format:
|
|        000000 000011 1 1 1111 112222222222333333
|        012345 678901 2 3 4567 890123456789012345
|        _________________________________________
|       |      |      | | |    |                  |
|       | POS  | SIZE |U|I| X  |        Y         |
|       |______|______|_|_|____|__________________|
|
| >POS is the byte position: the number of bits from the right
| end of the byte to the right end of the word. SIZE is the byte
| size in bits.
|
| >The U field is ignored by the byte instructions.
|
| >The I, X and Y fields are used, just as in an instruction, to
| compute an effective address which specifies the location of
| the word containing the byte.
|
| "If you're not playing with 36 bits, you're not playing with a
| full DEC!" -DIGEX (Doug Humphrey)
|
| http://otc.umd.edu/staff/humphrey
| [deleted]
| stkdump wrote:
| Historical and obscure machines aside, there are a few things
| modern C++ code should take for granted, because even new systems
| will probably not bother breaking them: Text is encoded in UTF-8.
| Negative integers are twos-complement. Float is 32 bit ieee 754,
| double and long double are 64 bit ieee 754. Char is 8 bit, short
| is 16 bit, int is 32 bit, long long is 64 bit.
| pabs3 wrote:
| I wonder if those macros work with middle-endian systems.
| froh wrote:
| no. but hton(3)/ntoh(3) from inet.h do.
| dataflow wrote:
| Is this a joke or am I just unaware of any systems out there
| that are "middle-endian"..?!
| mannschott wrote:
| Sadly not a joke, but thankfully quite obscure:
| https://en.wikipedia.org/wiki/Endianness#Middle-endian
| hvdijk wrote:
| There are no current middle-endian systems but they used to
| exist. The PDP-11 is the most famous one. The macros would
| work on all systems, but as only very old systems are middle-
| endian, they also have old compilers so may not be able to
| optimise it as well.
| ttt0 wrote:
| https://twitter.com/m13253/status/1371615680068526081
|
| Would it hurt anyone to define this undefined behavior and do
| exactly what the source code says?
| MauranKilom wrote:
| Not sure what you think the source code "says". I mean, I know
| what you want it to mean, but just because integer wrapping is
| intuitive to you doesn't imply that that is what the code
| means. C++ abstract machine and all.
|
| But to answer the actual question: For C++20, integer types
| were revisited. It is now (finally) guaranteed that signed
| integers are two's complement, along with a list of other
| changes. See http://www.open-
| std.org/jtc1/sc22/wg21/docs/papers/2018/p090... also for how
| the committee voted on the individual issues.
|
| Note in particular:
|
| > The main change between [P0907r0] and the subsequent revision
| is to maintain undefined behavior when signed integer overflow
| occurs, instead of defining wrapping behavior. This direction
| was motivated by:
|
| > - Performance concerns, whereby defining the behavior
| prevents optimizers from assuming that overflow never occurs;
|
| > - Implementation leeway for tools such as sanitizers;
|
| > - Data from Google suggesting that over 90% of all overflow
| is a bug, and defining wrapping behavior would not have solved
| the bug.
|
| So yes, the committee very recently revisited this specific
| issue, and re-affirmed that signed integer overflow should be
| UB.
| ttt0 wrote:
| I haven't noticed the signed integer overflow, which does
| indeed complicate things, and I thought it was just the
| infinite loop UB.
|
| > Data from Google suggesting that over 90% of all overflow
| is a bug, and defining wrapping behavior would not have
| solved the bug.
|
| Of _all_ overflow? Including unsigned integers where the
| behavior is defined?
| aliceryhl wrote:
| That 90% of all overflows are bugs doesn't surprise me at
| all, even if you include unsigned integers.
| nly wrote:
| This is why, in 2021, the mantra that C is a good language for
| these low level byte twiddling tasks needs to die. Dealing with
| alignment and endianness properly requires a language that allows
| you to build abstractions.
|
| The following is perfectly well defined in C++, despite
| looking almost the same as the original unsafe C:
|
|       #include <boost/endian.hpp>
|       #include <cstdio>
|
|       using namespace boost::endian;
|
|       unsigned char b[5] = {0x80,0x01,0x02,0x03,0x04};
|
|       int main() {
|           uint32_t x = *((big_uint32_t*)(b+1));
|           printf("%08x\n", x);
|       }
|
| Note that I deliberately misaligned the pointer by adding 1.
|
| https://gcc.godbolt.org/z/5416oefjx
|
| [Edit] Fun twist: the above code doesn't work when the
| intermediate variable x is removed, because printf itself is
| not type safe, so no type conversion (which is when the bswap
| happens) takes place. In pure C++, when using a type safe
| formatting function (like fmt or iostreams), this wouldn't
| happen. printf will let you throw any garbage into it. tl;dr
| outside embedded use cases, writing C in 2021 is fucking nuts.
| IgorPartola wrote:
| As a very minor counterpoint: I like C because frankly it's
| fun. I wouldn't start a web browser or maybe even an operating
| system in it today, but as a language for messing around I find
| it rewarding. I also think it is incredibly instructive in a
| lot of ways. I am not a C++ developer but ANSI C has a special
| place in my heart.
|
| Also, I will say that when it comes to programming Arduinos and
| ESP8266/ESP32 chips, I still find that C is my go to despite
| things like Alia, MicroPython, etc. I think it's possible that
| once Zig supports those devices fully that I might move over.
| But in the meantime I guess I'll keep minding my off by one
| errors.
| themulticaster wrote:
| This has nothing to do with C++ because your example only hides
| the real issue occurring in the blog post example: The
| unaligned read on the array. Try adding something like
| printf("%08x\n", *((uint32_t*)(b)));
|
| to your example and you'll see that it produces UB as well. The
| reason there is no UB with big_uint32_t is probably that the
| struct/class/whatever it is redefines its dereferencing
| operator to perform byte-wise reads.
|
| Godbolt example: https://gcc.godbolt.org/z/seWrb5cz7
| nly wrote:
| I fail to see your point. The point of my post is that the
| abstractions you can build in C++ are as easy to use and as
| efficient as doing things the wrong, unsafe way...so there's
| no reason not to do things in a safe, correct way.
|
| Obviously if you write C and compile it as C++ you still end
| up with UB, because C++ aims for extreme levels of
| compatibility with C.
| themulticaster wrote:
| Sorry for being unclear. My point is that the example in
| the blog post does two things, a) it reads an unaligned
| address causing UB and b) it performs byte-order swapping.
| The post then goes on about avoiding UB in part b), but all
| the time the UB was caused by the unaligned access in a).
|
| Of course your example solves both a) and b) by using
| big_uint32_t, and I agree that this is an interesting
| abstraction provided by Boost, but I think the takeaway
| "use C++ for low-level byte fiddling" is slightly
| misleading: Say I was a novice C++ programmer, saw your
| example of how C++ improves this but at the same time don't
| know that big_uint32_t solves the hassle of reading a word
| from an unaligned address for me. Now I use your pattern in
| my byte-fiddling code, but then I need to read a word in
| host endianness. What do I do? Right, I remember the HN
| post and write *((uint32_t*)(b+1)) (without the big_,
| because I don't need that!). And then I unintentionally
| introduced UB. In other words, big_uint32_t is a little
| "magic" in this case, as it suggests a similarity to
| uint32_t which does not actually exist.
|
| To be honest, I don't think the byte-wise reading is in any
| way inappropriate in this case: If you're trying to read a
| word _in non-native byte order from an unaligned access_,
| it is perfectly fine to be very explicit about what you're
| doing in my opinion. There also is nothing unsafe about
| doing this as long as you follow certain guidelines, as
| mentioned elsewhere in this thread.
| nly wrote:
| Sure, the only correct way to read an unaligned value in
| to an aligned data type in both C or C++ is via memcpy.
|
| I still think being able to define a type that models
| what you're doing is incredibly valuable because as long
| as you don't step outside your type system you get so
| much for free.
| sgtnoodle wrote:
| You could also mask and shift the value byte-wise just
| like with an endian swap. Depending on the destination
| and how aggressive the compiler optimizes memcpy or not,
| it could even produce more optimal code, perhaps by
| working in registers more.
|
| Conceptual consistency is a good thing, but there is a
| generally higher cognitive load to using C++ over C. I've
| used both C++ and C professionally, and I've gone deeper
| with type safety and metaprogramming than most folk. I've
| mostly used C for the last few years, and I don't feel
| like I'm missing anything. It's still possible to write
| hard-to-misuse code by coming up with abstractions that
| play to the language's strengths.
|
| Operator overloading in particular is something I've
| refined my opinion on over the years. My current thought
| is that it's best not to use operators in
| user/application defined APIs, and should be reserved for
| implementing language defined "standard" APIs like the
| STL. Instead, it's better to use functions with names
| that unambiguously describe their purpose.
| foldr wrote:
| What are the advantages of this over a simple function with the
| following signature?
|
|       uint32_t read_big_uint32(char *bytes);
|
| Having a big_uint32_t type seems wrong to me conceptually. You
| should either deal with sequences of bytes with a defined
| endianness or with native 32-bit integers of indeterminate
| endianness (assuming that your code is intended to be endian
| neutral). Having some kind of halfway house just confuses
| things.
| nly wrote:
| The library provides those functions too, but I don't see how
| having an arithmetic type with well-defined size, endianness
| and alignment is a bad thing.
|
| If you're defining a struct to mirror a data structure from a
| device, protocol or file format then the language / type
| system should let you define the properties of the fields,
| not necessarily force you to introduce a parsing/decoding
| stage which could be more easily bypassed.
| lanstin wrote:
| It is no longer arithmetic if there is an endianness. Some
| things are numbers and some things are sequences of bytes.
| Arithmetic only works on the former.
| mfost wrote:
| I'd say, putting multiple of those types into a struct
| then perfectly describes the memory layout of each byte of
| data in a memory/network packet, in a reliable and user-
| friendly way for the coder to manipulate.
| foldr wrote:
| I see. That does seem helpful once you consider how these
| types compose, rather than thinking about a one-off
| conversion. However, I think it would be cleaner to have a
| library that auto-generated a parser for a given struct
| paired with an endianness specification, rather than baking
| the endianness into the types. (Probably this could be
| achieved by template metaprogramming too.)
| themulticaster wrote:
| I agree, but a little nitpick: A sequence of bytes does not
| have a defined endianness. Only groups of more than one byte
| (i.e. half words, words, double words or whatever you want to
| call them) have an endianness.
|
| In practice, most projects (e.g. the Linux kernel or the
| socket interface) differentiate between host (indeterminate)
| byte order and a specific byte order (e.g. network byte
| order/big endian).
| jeffreygoesto wrote:
| Wouldn't that cast be UB because it is type punning?
| professoretc wrote:
| char* is allowed to alias other pointer types.
| jeffreygoesto wrote:
| Hm. Afaik, you are always allowed to convert _to_ a char,
| but _from_ is not OK in general. See e.g. [0]
|
| [0] https://gist.github.com/shafik/848ae25ee209f698763cffee
| 272a5...
| jstimpfle wrote:
| I think you missed the point of the post and the issues
| described in it.
|
| In my estimation, libraries like boost are way too big and way
| too clever and they create more problems than they solve. Also,
| they don't make me happy.
|
| You're overfocusing on a "problem" that is almost completely
| irrelevant for most of programming. Big endian is rarely
| encountered these days (almost no hardware, but some file
| formats and networking APIs have big-endian data in them).
| Where you still meet it, you don't do endianness conversions
| willy-nilly.
| You have only a few lines in a huge project that should be
| concerned with it. Similar situation for dealing with
| unaligned reads.
|
| So, with boost you end up with a huge slow-compiling dependency
| to solve a problem using obscure implicit mechanisms that
| almost no-one understands or can even spot (I would never have
| guessed that your line above seems to handle misalignment or
| byte swapping).
|
| This approach is typical for a large group of C++ programmers,
| who seem to like to optimize for short code snippets,
| cleverness, and/or pedantry.
|
| The actual issue described in the post was the UB that is easy
| to hit when doing bit shifting, caused by the implicit
| conversions that are defined in C. While this is definitely an
| unhappy situation, it's easy enough to avoid using plain C
| syntax (cast each expression to unsigned before shifting), with
| no more code than the boost-type cast in your code above.
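|
| For the record, a sketch of that plain-C idiom (each byte is
| cast to an unsigned type before shifting, so nothing ever
| shifts into a signed int's sign bit):
|
|       #include <stdint.h>
|
|       uint32_t read32be(const unsigned char *p)
|       {
|           return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
|                  (uint32_t)p[2] << 8  | (uint32_t)p[3];
|       }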
|
| The fact that the UB is so easy to hit doesn't call for
| excessive abstraction, but simply a revisit of some of the UB
| defined in C, and how compiler writers exploit it.
|
| (Anecdata: I've written a fair share of C code, though not
| compression or encryption algorithms, and personally I'm not
| sure I've ever hit one of the evil cases of UB. I've hit
| Segmentation faults or had Out-of-bounds accesses, sure, but
| personally I've never seen the language or compilers "haunt
| me".)
| jart wrote:
| Do you use UBSAN and ASAN? When you write unit tests do you
| feed numbers like 0x80000000 into your algorithm? When you
| allocate test memory have you considered doing it with
| mmap(4096) and putting the data at the _end_ of the map? (Or
| better yet, double it and use mprotect). Those are some good
| examples of torture tests if you're in the mood to feel
| haunted.
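|
| A sketch of that guard-page trick (assuming 4096-byte pages;
| not code from the article):
|
|       #include <stddef.h>
|       #include <sys/mman.h>
|
|       /* map two pages, make the second one inaccessible, and
|          place n bytes of test data flush against the boundary
|          so any read past the end faults immediately */
|       unsigned char *guarded(size_t n)
|       {
|           size_t pg = 4096;
|           unsigned char *m = mmap(0, 2 * pg,
|                                   PROT_READ | PROT_WRITE,
|                                   MAP_PRIVATE | MAP_ANONYMOUS,
|                                   -1, 0);
|           if (m == MAP_FAILED) return 0;
|           mprotect(m + pg, pg, PROT_NONE);
|           return m + pg - n;
|       }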
| [deleted]
| sdenton4 wrote:
| Every day I spend futzing around with endianness is a day I'm
| not solving 'real' problems. These things are a distraction
| and a complete waste of developer time: It should be solved
| 'once' and only worried about by people specifically looking
| to improve on the existing solution. If it can't be handled
| by a library call, there's something really broken in the
| language.
|
| (imo, both c and cpp are mainly advocated by people suffering
| from stockholm syndrome.)
| raphlinus wrote:
| I agree with the bulk of this post.
|
| Re the anecdata at the end. Have you ever run your code
| through the sanitizers? I have. CVE-2016-2414 is one of my
| battle scars, and I consider myself a pretty good programmer
| who is aware of security implications.
| jstimpfle wrote:
| Very little, quite frankly. I've used valgrind in the past,
| and found very few problems. I just ran
| -fsanitize=undefined for the first time on one of my
| current projects, which is an embedded network service of
| 8KLOC, and with a quick test covering probably 50% of the
| codepaths by doing network requests, no UB was detected (I
| made sure the sanitizer works in my build by introducing a
| (1<<31) expression).
|
| Admittedly I'm not the type of person who spends his time
| fuzzing his own projects, so my statement was just to say
| that the kind of bugs that I hit by just testing my
| software casually are almost all of the very trivial kind -
| I've never experienced the feeling that the compiler
| "betrayed" me and introduced an obscure bug for something
| that looks like correct code.
|
| I can't immediately see the problem in your CVE here [0],
| was that some kind of betrayal by compiler situation? Seems
| like strange things could happen if (end - start)
| underflows.
|
| [0] https://android.googlesource.com/platform/frameworks/mi
| nikin...
| raphlinus wrote:
| This one wasn't specifically "betrayal by compiler," but
| it was a confusion between signed and unsigned quantities
| for a size field, which is very similar to the UB
| exhibited in OP.
|
| Also, the fact that you can't see the problem is actually
| evidence of how insidious these problems are :)
|
| The rules for this are arcane, and, while the solution
| suggested in OP is correct, it skates close to the edge,
| in that there are many similar idioms that are not ok. In
| particular, (p[1] << 8) & 0xff00, which is code I've
| written, is potentially UB (hence "mask, and then shift"
| as a mantra). I'd be surprised if anyone other than jart
| or someone who's been part of the C or C++ standards
| process can say why.
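|
| To spell out the hazard (a sketch; p is a char pointer on a
| platform where char is signed, v an int accumulator):
|
|       v |= (p[1] << 8) & 0xff00; /* shift, then mask: if p[1] is
|                                     negative it promotes to a
|                                     negative int, and left-shifting
|                                     a negative int is UB */
|
|       v |= (p[1] & 0xff) << 8;   /* mask, then shift: the mask
|                                     yields a non-negative int
|                                     before the shift */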
| [deleted]
| vlovich123 wrote:
| Raph, clearly you're just not as good a programmer as you
| think you are.
| raphlinus wrote:
| Why thank you Vitali. Coming from you, that is high
| praise indeed.
| vladharbuz wrote:
| Correct me if I'm wrong, but your example is just using a
| library to do the same task, rather than illustrating any
| difference between C and C++. If you want to pull boost in to
| do this, that's great, but that hardly seems like a fair
| comparison to the OP, since instead of implementing code to
| solve this problem yourself you're just importing someone
| else's code.
| nly wrote:
| No, the fact that this can be done in a library and looks
| like a native language feature demonstrates the power of C++
| as a language.
|
| This example is demonstrating:
|
| - First class treatment of user (or library) defined types
|
| - Operator overloading
|
| - The fact that it produces fast machine code. Try changing
| big_uint32_t to regular uint32_t to see how this changes.
| When you use the latter, ubsan will introduce a trap for
| runtime checks, but it doesn't need to in this case.
| simias wrote:
| Operator overloading is a mixed blessing though, it can be
| very convenient but it's also very good at obfuscating
| what's going on.
|
| For instance I'm not familiar with this boost library so
| I'd have a lot of trouble piecing out what your snippet
| does, especially since there's no explicit function call
| besides the printf.
|
| Personally if we're going the OOP route I'd much prefer
| something like Rust's `var.to_be()`, `var.to_le()` etc... At
| least it's very explicit.
|
| My hot take is that operator overloading should only ever
| be used for mathematical operators (multiplying vectors
| etc...), everything else is almost invariably a bad idea.
| pwdisswordfish8 wrote:
| Ironically, it was proposed not so long ago to deprecate
| to_be/to_le in favour of to_be_bytes/to_le_bytes, since
| the former conflate abstract values with bit
| representations.
| nly wrote:
| That's fine if whatever type 'var' happens to be is NOT
| usable as an arithmetic type, otherwise you can easily
| just forget to call .to_le() or .to_native(), or
| whatever, and end up with a bug. I don't know Rust, so
| don't know if this is the case?
|
| Boost.Endian actually lets you pick between arithmetic
| and buffer types.
|
| 'big_uint32_buf_t' is a buffer type that requires you to
| call .value() or do a conversion to an integral type. It
| does not support arithmetic operations.
|
| 'big_uint32_t' is an arithmetic type, and supports all
| the arithmetic operators.
|
| There are also variants of both endian suffixed '_at' for
| when you know you have aligned access.
| raphlinus wrote:
| The idiomatic way to do this in Rust is to use functions
| like .to_le_bytes(), so you have the u32 (or whatever) on
| one end and raw bytes (something like [u8; 4]) on the
| other. It can get slightly tedious if you're doing it by
| hand, but it's impossible to accidentally forget. If
| you're doing this kind of thing at scale, like dealing
| with TrueType fonts (another bastion of big-endian), it's
| common to reach for derive macros, which automate a great
| deal of the tedium.
| nly wrote:
| Who decides what methods to add to the bytes
| type/abstraction?
|
| If I have a 3 byte big endian integer can I access it
| easily in rust without resorting to shifts?
|
| In C++ I could probably create a fairly convincing
| big_uint24_t type and use it in a packed struct and there
| would be no inconsistencies with how it's used with
| respect to the more common varieties
| raphlinus wrote:
| In Rust, [u8; N] and &[u8] are both primitive types, and
| not abstractions. It's possible to create an abstraction
| around either (the former even more so now with const
| generics), but that's not necessary. It's also possible
| to use "extension traits" to add methods, even to
| existing and built-in types[1].
|
| I'm not sure about a 3 byte big endian integer. I mean,
| that's going to compile down to some combination of
| shifting and masking operations anyway, isn't it? I
| suspect that if you have some oddball binary format that
| needs this, it will be possible to write some code to
| marshal it that compiles down to the best possible asm.
| Godbolt is your friend here :)
|
| [1]: https://rust-lang.github.io/rfcs/0445-extension-
| trait-conven...
| nly wrote:
| I agree then that in Rust you could make something
| consistent.
|
| I think there's no need for explicit shifts. You need to
| memcpy anyway to deal with alignment issues, so you may
| as well just copy in to the last 3 bytes of a zero-
| initialized, big endian, 32bit uint.
|
| https://gcc.godbolt.org/z/jEnsW8WfE
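|
| In portable C the same idea looks roughly like this (a
| sketch; be32toh is the glibc/BSD <endian.h> routine
| mentioned elsewhere in the thread):
|
|       #include <endian.h>
|       #include <stdint.h>
|       #include <string.h>
|
|       uint32_t read_be_u24(const unsigned char *p)
|       {
|           unsigned char buf[4] = {0}; /* zero-initialized */
|           uint32_t v;
|           memcpy(buf + 1, p, 3);      /* last 3 bytes of a BE u32 */
|           memcpy(&v, buf, sizeof(v));
|           return be32toh(v);
|       }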
| raphlinus wrote:
| That's just constant folding. Here's what it looks like
| when you actually need to go to memory:
|
| https://gcc.godbolt.org/z/9qGqh6M1E
|
| And I think we're on the same page, it should be possible
| to get similar results in Rust.
| cbmuser wrote:
| You are still casting one pointer type into another which
| can result in unaligned access.
|
| If you need to change byte orders, you should use a
| library to achieve that.
| nly wrote:
| Boost.Endian is the library here and this code is safe
| because the big_uint32_t type has an alignment
| requirement of 1 byte.
|
| This is why ubsan is silent and not even injecting a
| check in to the compiled code.
|
| You can check the alignment constraints with
| static_assert (something else you can't do in standard
| C): https://gcc.godbolt.org/z/KTcf9ax6r
| kevin_thibedeau wrote:
| C11 has static_assert:
| https://gcc.godbolt.org/z/E3bGc95o3
|
| It also has _Generic() so you can roll up a family of
| endianness conversion functions and safely change types
| without blowing up somewhere else with a hardcoded
| conversion routine.
| Brian_K_White wrote:
| It demonstrates that c++ is even less safe.
| 0x000000E2 wrote:
| By the same token, I think most uses for C++ these days are
| nuts. If you're doing a greenfield project 90% of the time it's
| better to use Rust.
|
| C++ has a multitude of its own pitfalls. Some of the C
| programmer hate for C++ is justified. After all, it's just C
| with a pre-processing stage in the end.
|
| There's good reasons why many C projects never considered C++
| but are already integrating the nascent Rust. I always hated
| low level programming until Rust made it just as easy and
| productive as high level stuff
| jart wrote:
| C is perfect for these problems. I like teaching the endian
| serialization problem because it broaches so many of the topics
| that are key to understanding C/C++ in general. Even if we
| choose to spend the majority of our time plumbing together
| functions written by better men, it's nice to understand how
| the language is defined so we could write those functions, even
| if we don't need to.
| nly wrote:
| For sure, it's a good way to teach that C is insufficient to
| deal with even the simplest of tasks. Unfortunately teaching
| has a bad habit of becoming practice, no matter how good the
| intention.
|
| With regard to teaching C++ specifically I tend to agree with
| this talk:
|
| CppCon 2015 - Kate Gregory "Stop Teaching C":
| https://www.youtube.com/watch?v=YnWhqhNdYyk
| jart wrote:
| One of her slides was titled "Stop teaching pointers!" too.
| My VP back at my old job snapped at me once because I got
| too excited about the pointer abstractions provided by
| modern C++. Ever since that day I try to take a more
| rational approach to writing native code where I consider
| what it looks like in binary and I've configured my Emacs
| so it can do what clang.godbolt.org does in a single
| keystroke.
| nly wrote:
| For the record, she's not really saying people shouldn't
| learn this low level stuff... just that 'intro to C++'
| shouldn't be teaching this stuff _first_
|
| The biggest problem with C++ in industry is that people
| tend to write "C/C++" when it deserves to be recognized
| as a language in its own right.
| jart wrote:
| One does not simply introduce C++. It's the most insanely
| hardcore language there is. I wouldn't have stood any
| chance understanding it had it not been for my gentle
| introduction with C for several years.
| SAI_Peregrinus wrote:
| C++ makes Rust look easy to learn.
| pjmlp wrote:
| Really?
|
| Apparently the first-year students at my university
| didn't have any issue going from Standard Pascal to C++
| in the mid-90's.
|
| Proper C++ was taught using our string, vector and
| collection classes, given that we were still a couple of
| years away from ISO C++ being fully defined.
|
| C style programming with low level tricks were only
| introduced later as advanced topics.
|
| Apparently thousands of students managed to get through
| the remaining 5 years of the degree.
| BenjiWiebe wrote:
| C++ in the mid 90s was a lot simpler than C++ now.
| pjmlp wrote:
| No one obliges you to write C++20 with SFINAE template
| meta-programming, using classes with CTAD constructors.
|
| Just like no Python newbie is able to master Python 3.9
| full language set, standard library, numpy, pandas,
| django,...
| jart wrote:
| Well there's a reason universities switched to Java when
| teaching algorithms and containers after the 90's. C++ is
| a weaker abstraction that encourages the kind of
| curiosity that's going to cause a student's brain to melt
| the moment they try to figure out how things work and
| encounter the sorts of demons the coursework hasn't
| prepared them to face. If I was going to teach it, I'd
| start with octal machine codes and work my way up.
| https://justine.lol/blinkenlights/realmode.html Sort of
| like if I were to teach TypeScript then I'd start with
| JavaScript. My approach to native development probably
| has more in common with web development than it does with
| modern c++ practices to be honest, and that's something I
| talk about in one of my famous hacks: https://github.com/
| jart/cosmopolitan/blob/4577f7fe11e5d8ef0a...
| pjmlp wrote:
| US universities maybe, there isn't much Java on my former
| university learning plan.
|
| The only subjects that went full into Java were
| distributed computing and compiler design.
|
| And during the last 20 years they already went back on
| their decision.
|
| I should note that languages like Prolog, ML and
| Smalltalk were part of the learning subjects as well.
|
| Assembly was part of electronic subjects where design of
| a pseudo CPU was also part of the themes. So we had our
| own pseudo Assembly, x86 and MIPS.
| jcelerier wrote:
| > Well there's a reason universities switched to Java
| when teaching algorithms and containers after the 90's
|
| Where? I learned algorithms in C and C++ (and also a bit
| in Caml and LISP) and I was in university 2011-2014
| ta988 wrote:
| Yes, this is the curse of knowledge: people who know C++
| through decades of exposure are usually unable to bring
| any newcomer to it.
| microtherion wrote:
| Yes, there is some value in using C for teaching these
| concepts. But the problem I see is that, once taught, many
| people will then continue to use C and their hand written
| byte swapping functions, instead of moving on to languages
| with better abstraction facilities and/or availing themselves
| of the (as you point out) many available library
| implementations of this functionality.
| ok123456 wrote:
| Or just use the functions in <arpa/inet.h> to convert from host
| to network byteorder?
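|
| E.g. (a sketch):
|
|       #include <arpa/inet.h>
|
|       uint32_t wire = htonl(0x11223344); /* host to network (BE) */
|       uint32_t host = ntohl(wire);       /* network back to host */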
| froh wrote:
| this! use hton/ntoh and be happy.
|
| nitpick: the 64bit versions (htonll, ntohll) are not
| universally available yet
| Animats wrote:
| Rust gets this right. These primitives are available for all the
| numeric types:
|
|       u32::from_le_bytes(bytes)  // u32 from 4 bytes, little endian
|       u32::from_be_bytes(bytes)  // u32 from 4 bytes, big endian
|       u32::to_le_bytes(num)      // u32 to 4 bytes, little endian
|       u32::to_be_bytes(num)      // u32 to 4 bytes, big endian
|
| This was very useful to me recently as I had to write the
| marshaling and un-marshaling for a game networking format with
| hundreds of messages. With primitives like this, you can see
| what's going on.
| infradig wrote:
| There are equivalent functions in C too. The point of the
| article is that you shouldn't need them. So how you would
| implement the above functions in Rust would be more pertinent.
| froh wrote:
| isn't the point to be careful when implementing them, so the
| compiler detects the intention to byteswap?
|
| when we ported little endian x86 Linux to the big endian
| mainframe we sprinkled hton/ntoh all over the place, happily
| so. they are the way to go and they should be implemented
| properly, not be replaced by a homegrown version.
|
| all that said, I'm surprised 64bit htonll and ntohll are not
| standard yet. anybody know why?
| thechao wrote:
| Blech. I learned to program (around '99) by implementing
| the crusty old FCS1.0 format, which allows for aggressively
| weird wire formats. Our machine was a PDP-11/72 with its
| head sawzalled off and custom wire wrap boards dropped in.
| The "native" format (coming from analog) was 2143 order as
| a 36b packet. The bits were [8,0:7] (using verilog
| notation). However, sprinkled randomly in the binary header
| were chunks of 7- and 8-bit ANSI (packed) and some mutant
| knockoff 6-bit EBCDIC.
|
| The original listing was written by "Jennifer -- please
| call me if you have troubles", an undergraduate from MIT.
| It was hand-assembled machine code, in a neat hand in a big
| blue binder. That code ran non-stop except for a few
| hurricanes from 1988 until 2008; bug-free as far as I could
| tell. Jennifer last-name-unknown, you were my idol & my
| demon!
|
| I swore off programming for nearly a year after that.
| Negitivefrags wrote:
| Unless you are planning on running your game on a mainframe,
| just don't bother with endianness for the networking.
|
| Big endian is dead for game developers.
|
| Copy entire arrays of structs onto the wire without fear!
|
| (Just #pragma pack them first)
| musicale wrote:
| > game on a mainframe
|
| Maybe your program isn't a game.
|
| Maybe you have to deal with a server that uses Power, or an
| embedded system that uses PowerPC (or ARM or MIPS in big-
| endian mode).
|
| Maybe you're running on an older architecture (SPARC,
| PowerPC, 68K.)
|
| Maybe you have to deal with a pre-defined data format (e.g.
| TCP/IP packet headers) that uses big-endian byte ordering for
| some of its components.
| Aeolun wrote:
| That's theoretically possible. But I'd be very interested
| in why. Especially if you are doing anything involving
| networking.
| einpoklum wrote:
| This is valid code in C++20:
|
|       if constexpr (std::endian::native == std::endian::big) {
|           std::cout << "big-endian" << '\n';
|       } else if constexpr (std::endian::native == std::endian::little) {
|           std::cout << "little-endian" << '\n';
|       } else {
|           std::cout << "mixed-endian" << '\n';
|       }
|
| Doesn't solve everything, but it's saner even if what you're
| writing is C-style low-level code.
| st_goliath wrote:
| FWIW there is a <sys/endian.h> on various BSDs that contains
| "beXXtoh", "leXXtoh", "htobeXX", "htoleXX" where XX is a number
| of bits (16, 32, 64).
|
| That header is also available on Linux, but glibc (and compatible
| libraries) named it <endian.h> instead.
|
| See: man 3 endian (https://linux.die.net/man/3/endian)
|
| Of course it gets a bit hairier if the code is also supposed to
| run on other systems.
|
| MacOS has OSSwapHostToLittleIntXX, OSSwapLittleToHostIntXX,
| OSSwapHostToBigIntXX and OSSwapBigToHostIntXX in
| <libkern/OSByteOrder.h>.
|
| I'm not sure if Windows has something similar, or if it even
| supports running on big endian machines (if you know, please
| tell).
|
| My solution for achieving some portability currently entails
| cobbling together a "compat.h" header that defines macros for the
| MacOS functions and including the right headers. Something like
| this:
|
| https://github.com/AgentD/squashfs-tools-ng/blob/master/incl...
|
| This is usually my go-to solution for working with low-level on-
| disk or on-the-wire binary data structures that demand a specific
| endianness. In C I use "load/store" style functions that memcpy
| the data from a buffer into a struct instance and do the endian
| swapping (or reverse for the store). The copying is also
| necessary because the struct in the buffer may not have proper
| alignment.
|
| Technically, the giant macro of doom in the article takes care of
| all of this as well. But unlike the article, I would very much
| not recommend hacking up your own stuff if there are systems
| libraries readily available that take care of doing the same
| thing in an efficient manner.
|
| In C++ code, all of this can of course be neatly stowed away in a
| special class with overloaded operators that transparently takes
| care of everything and "decays" into a single integer and exactly
| the above code after compilation, but is IMO somewhat cleaner to
| read and adds much needed type safety.
| anticristi wrote:
| Indeed, I don't get the article. It's like writing "C is hard
| because here is how hard it is to implement memcpy using SIMD
| correctly."
|
| Please don't do that. Use battle-tested low-level routines.
| Unless your USP is "our software swaps bytes faster than the
| competition", you should not spend brain power on that.
| nwallin wrote:
| Windows/MSVC has _byteswap_ushort(), _byteswap_ulong(), and
| _byteswap_uint64() (note that unsigned long is 32 bits on
| Windows). It's ugly but it works.
|
| Boost provides boost::endian which allows converting between
| native and big or little, which just does the right thing on
| all architectures and compilers and compiles down to a no-op or
| bswap instruction. It's much better than writing
| (and testing!) your own giant pile of macros and ifdefs to detect
| the compiler/architecture/OS, include the correct includes, and
| perform the correct conversions in the correct places.
| [deleted]
| tjoff wrote:
| At least historically windows have had big-endian versions as
| both SPARC and Itanium use big endian.
| electroly wrote:
| Itanium can be configured to run in either endianness (it's
| "bi-endian"). Windows on Itanium always ran in little-endian
| mode and did not support big-endian mode. The same was true
| of PowerPC. Windows never ran in big-endian mode on any
| architecture.
| Sebb767 wrote:
| In case anyone else wonders how the code in the linked tweet [0]
| would format your hard drive, it's the missing return on f1.
| Therefore, f1 is empty as well (no ret) and calling it will
| result in f2 being run. The commented out code is irrelevant.
|
| EDIT: Reading the bug report [1], the actual cause for the
| missing ret is that the for loop will overflow, which is UB and
| causes clang to not emit any code for the function.
|
| [0] https://twitter.com/m13253/status/1371615680068526081
|
| [1] https://bugs.llvm.org/show_bug.cgi?id=49599
| vlmutolo wrote:
| > If you program in C long enough, stuff like this becomes second
| nature, and it starts to almost feel inappropriate to even have
| macros like the above, since it might be more appropriately
| inlined into the specific code. Since there have simply been too
| many APIs introduced over the years for solving this problem. To
| name a few for 32-bit byte swapping alone: bswap_32, htobe32,
| htole32, be32toh, le32toh, ntohl, and htonl which all have pretty
| much the same meaning.
|
| > Now you don't need to use those APIs because you know the
| secret.
|
| This sentiment seems problematic. The solution shouldn't be "we
| just have to educate the masses of C programmers on how to
| properly deal with endianness". That will never happen.
|
| The solution should be "It's in the standard library. Go look
| there and don't think too hard." C is sufficiently low-level, and
| endianness problems sufficiently common, that I would expect that
| kind of routine to be available.
| lanstin wrote:
| The point is that keeping the distinction clear in your head
| between numeric semantics and sequence-of-octets semantics
| makes the problem universally tractable. On one side you have
| a data structure with a numeric value. On the other you have a
| sequence of octets described by some protocol formalism, BNF
| in the old days. The mapping from one to the other occurs in
| the math between octets and numeric values and the various
| network protocols for representing numbers. There are many
| more choices than just big endian or little endian. Could be
| ASN.1 infinite-precision ints. Could be 32 bit IEEE floats or
| 64 bit IEEE floats. The distinction is universal between
| language semantics and external representations.
|
| This is why people who memcpy structs right into the buf get
| such derision, even if it's faster and written for a mono-
| implementation of a language semantics. It is sloppy thought
| made manifest.
| pjmlp wrote:
| Typical C culture, you would also expect that by now something
| like SDS would be part of the standard as well.
|
| https://github.com/antirez/sds
| saagarjha wrote:
| Adding API that introduces an entirely new string model that
| is incompatible with the rest of the standard library seems
| like a nonstarter.
| secondcoming wrote:
| Isn't the 'modern' solution to memcpy into a temp and swap the
| bytes in that? C++ has added/will add std::launder and std::bless
| to deal with this issue
| lanstin wrote:
| No, it is to read a byte at a time and turn it into the
| semantic value for the data structure you are filling in. Like
| read 128 and then 1, and set the variable to 32769. If you are
| the author of protobufs then you may run profiling and write
| the best assembly etc., but otherwise no, don't do it.
| loeg wrote:
| > Isn't the 'modern' solution to memcpy into a temp and swap
| the bytes in that?
|
| Or just use the endian.h / sys/endian.h routines, which do the
| right thing (be32dec / be32enc / whatever). memcpy+swap is
| fine, and easier to get right than the author's giant
| expressions, but you might as well use the named routines that
| do exactly what you want already.
| kingsuper20 wrote:
| I've never been very satisfied with these approaches for C where
| you hope the compiler does the right thing. It makes sense to
| provide some C implementation for portability's sake but any
| sizeable reordering cries out for a handtuned, processor
| specific, approach (and the non-sizeable probably doesn't require
| high speed). I would expect any SIMD instruction set to include a
| shuffle.
| phkahler wrote:
| It can also be a good idea to swap recursively. First swap the
| upper and lower halves, then swap the upper and lower quarters
| (bytes, for a 32-bit value), which can be done with only 2
| masks. Then if it's a 64-bit value, swap alternate bytes,
| again with only 2 masks. This can be extended all the way to a
| full bit reverse in 3 more lines, each with 2 masks and shifts.
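|
| A sketch of that halving scheme for a 32-bit byte swap (two
| masks per step, as described):
|
|       #include <stdint.h>
|
|       uint32_t bswap32(uint32_t x)
|       {
|           x = (x << 16) | (x >> 16);      /* swap the halves */
|           x = ((x & 0x00ff00ffu) << 8) |
|               ((x & 0xff00ff00u) >> 8);   /* swap bytes in halves */
|           return x;
|       }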
| [deleted]
| ipython wrote:
| It's not every day you can write a blog post that calls out rob
| pike... ;)
| jart wrote:
| Author here. I'm improving upon Rob Pike's outstanding work.
| Standing on the shoulders of a giant.
| ipython wrote:
| Totally agree. My comment was made in jest. Mad kudos to you
| as you clearly possess talent and humility that's in short
| supply today.
| bigbillheck wrote:
| I just use ntohl/htonl like a civilized person.
|
| (Yes, the article mentions those, but they've been standard for
| decades).
| froh wrote:
| what's the best practice for 64bit values these days? is htonll
| ntohll widely available yet?
| amelius wrote:
| Byte order is one of the great unnecessary historical fuck ups in
| computing.
|
| A similar one is that signedness of char is machine dependent.
| It's typically signed on Intel and unsigned on ARM.
|
| Sigh!
| mytailorisrich wrote:
| I don't think it's a fuck up, rather I think it was
| unavoidable: Both ways are equally valid and when the time came
| to make the decision, some people decided one way, some people
| decided the other way.
| amelius wrote:
| By the way, mathematicians also have their fuck ups:
|
| https://tauday.com/tau-manifesto
| 8jy89hui wrote:
| For anyone curious or who is still attached to pi, here is a
| response to the tau manifesto:
|
| https://blog.wolfram.com/2015/06/28/2-pi-or-not-2-pi/
| joppy wrote:
| Why is it an issue any more than say, order of fields in a
| struct is an issue? In one case you read bytes off the disk by
| doing ((b[0] << 8) | b[1]) (or equivalent); in the other case
| the order is reversed. Any application-level (say, not
| a compiler, debugger, etc) program should not even need to know
| the native byte order, it should only need to know the encoding
| that the file it's trying to read used.
| zabzonk wrote:
| > order of fields in a struct
|
| This is defined in C to be the order the fields are declared
| in.
| occamrazor wrote:
| But the padding rules between fields are a mess.
| ta_ca wrote:
| the greatest of all is lisp not being the most mainstream
| language, and we can only blame the lisp companies for this
| fiasco. in an ideal world we all would be using a lisp with
| parametric polymorphism. from highest level abstractions to
| machine level, all in one language.
| ta_ca wrote:
| i hope these downvotes are due to my failure at english or
| the comment being off-topic (or both). if not, can i just
| replace lisp with rust and be friends again?
| [deleted]
| bregma wrote:
| And which is the correct byte ordering, pray tell?
| bonzini wrote:
| Little endian has the advantage that you can read the low
| bits of data without having to adjust the address. So you can
| for example do long addition in memory order rather than
| having to go backwards, or (with an appropriate
| representation such as ULEB128) in one pass without knowing
| the size.
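|
| As a tiny sketch of the address property (assuming a little-
| endian host):
|             #include <stdint.h>
|             #include <stdio.h>
|             #include <string.h>
|             int main(void) {
|               uint64_t v = 0x1122334455667788ull;
|               uint8_t low;
|               memcpy(&low, &v, 1);      /* byte at base address */
|               printf("0x%02x\n", low);  /* 0x88: the low byte */
|               return 0;
|             }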
| js8 wrote:
| Maybe I am biased working on mainframes, but I would
| personally take big endian over little endian. The reason
| is when reading a hex dump, I can easily read the binary
| integers from left to right.
| bonzini wrote:
| That's the only thing that BE has over LE.
|
| But for example bitmaps in BE are a huge source of bugs,
| as readers and writers need to agree on the size to use
| for memory operations.
|
| "SIMD in a word" (e.g. doing strlen or strcmp with 32- or
| 64-bit memory accesses) might have mostly fallen out of
| fashion these days, but it's also more efficient in LE.
| wongarsu wrote:
| Big and little endian are named after the never-ending "holy"
| war in Gulliver's Travels over how to open eggs. So the names
| were always meant to suggest that it doesn't really matter.
| But I open my eggs on the little end.
| CountHackulus wrote:
| Middle-endian is the only correct answer. It's a tradeoff
| between both little-endian and big-endian. The PDP-11 got it
| right.
| ben509 wrote:
| Yup, we're all waiting for the rest of the world to catch
| up to MM/DD/YYYY.
| rwmj wrote:
| Big Endian of course :-) However the one which has won is
| Little Endian. Even IBM admitted this when it switched the
| default in POWER 7 to little endian. s390x is the only
| significant architecture that is still big endian.
| kstenerud wrote:
| Big endian is easier for humans to read when looking at a
| memory dump, but little endian has many useful features in
| binary encoding schemes due to the low byte being first.
|
| I used to like big endian more, but after deep investigation
| I now prefer little endian for any encoding schemes.
| bombcar wrote:
| Couldn't encoding systems be redone with emphasis on the
| high-order bits? Or is the assumption that the values are
| clustered in the low bits?
| amelius wrote:
| I think the fundamental problem is that if you start a
| computation using the N most significant bits and then
| incrementally add more bits, e.g. N+M bits total, then
| your first N bits might change as a result.
|
| E.g. a decimal example:
|             1.00/1.00   = 1.00
|             1.000/1.001 = 0.999000999000...
|
| (adding one more digit changes the first digits of the
| outcome)
| kstenerud wrote:
| You can put emphasis on high order bits, but that makes
| decoding more complex. With little endian the decoder
| builds low to high, which is MUCH easier to deal with,
| especially on spillover.
|
| For example, with ULEB128 [1], you just read 7 bits at a
| time, going higher and higher up the value you're
| reconstituting. If the value grows too big and you need
| to spill over to the next (such as with big integer
| implementations), you just fill the last bits of the old
| value, then put the remainder bits in the next value and
| continue on.
|
| With a big endian encoding method (i.e. VLQ used in MIDI
| format), you start from the high bits and work your way
| down, which is fine until your value spills over. Because
| you only have the high bits decoded at the time of the
| spillover, you now have to start shifting bits along each
| of your already decoded big integer portions until you
| finally decode the lowest bit. This of course gets
| progressively slower as the bits and your big integer
| portions pile up.
|
| Encoding is easier too, since you don't need to check if
| for example a uint64 integer value can be encoded in 1,
| 2, 3, 4, 5, 6, 7 or 8 bytes. Just encode the low 8 bits,
| shift the source right by 8, repeat, until the source
| value is 0. Then backtrack to the as-yet-blank encoded
| length field in your message and stuff in how many bytes
| you encoded. You just got the length calculation for
| free. Use a scheme where you only encode up to 60-bit
| values, place the length field in the low 4 bits, and
| Robert's your father's brother!
|
| For data that is right-heavy (i.e. the fully formed data
| always has real data on the right side and blank filler
| on the left - such as uint32 value 8 is actually
| 0x00000008), you want a little endian scheme. For data
| that is left-heavy, you want a big endian scheme. Since
| most of the data we deal with is right-heavy, little
| endian is the way to go.
|
| You can see how this has influenced my encoding design in
| [2] [3] [4].
|
| [1] https://en.wikipedia.org/wiki/LEB128
|
| [2] https://github.com/kstenerud/concise-
| encoding/blob/master/cb...
|
| [3] https://github.com/kstenerud/compact-
| float/blob/master/compa...
|
| [4] https://github.com/kstenerud/compact-
| time/blob/master/compac...
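|
| A minimal sketch of the ULEB128 decode loop described above
| (uleb128_decode is an illustrative name, not from the linked
| specs):
|             #include <stddef.h>
|             #include <stdint.h>
|             /* Read 7 bits per byte, low group first, so each
|                byte slots into ascending bit positions. Returns
|                bytes consumed, or 0 on truncated input. */
|             size_t uleb128_decode(const uint8_t *p, size_t n,
|                                   uint64_t *out) {
|               uint64_t v = 0;
|               for (size_t i = 0; i < n && i < 10; i++) {
|                 v |= (uint64_t)(p[i] & 0x7f) << (7 * i);
|                 if (!(p[i] & 0x80)) { *out = v; return i + 1; }
|               }
|               return 0;
|             }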
| pantalaimon wrote:
| The good thing is that Big Endian is pretty much irrelevant
| these days. Of all the historically Big Endian architectures,
| s390x is indeed the only one left that has not switched to
| little endian.
| globular-toast wrote:
| Even if all CPUs were little-endian, big-endian would exist
| almost everywhere _except_ CPUs, including in your head.
| Unless you're some odd person who actually thinks in
| little-endian.
| chrisseaton wrote:
| > The good thing is that Big Endian is pretty much irrelevant
| these days.
|
| This is nonsense - many file formats are big endian.
| benjohnson wrote:
| With a bonus of some being EBCDIC too.
| lanstin wrote:
| This is true.
| erk__ wrote:
| As was discussed in a subthread yesterday [0], ARM does
| support big endian; though it is not used much anymore, it
| is still there.
|
| POWER also still uses big endian, though recently little
| endian POWER has gotten more popular.
|
| [0]: https://news.ycombinator.com/item?id=27075419
| akvadrako wrote:
| Network protocols still mostly use "Network Byte Order", i.e.
| big endian.
| lanstin wrote:
| Or text. Or handled by generated code like protobuf.
| tssva wrote:
| Network byte order is big endian so it is far from being
| pretty much irrelevant these days.
| BenoitEssiambre wrote:
| Also, this might be irrelevant at the cpu level, but within
| a byte, bits are usually displayed most significant bit
| first, so with little endian you end up with bit order:
|
| 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8
|
| instead of
|
| 15 to 0
|
| This is because little endian is not how humans write
| numbers. For consistency with little endianness we would
| have to switch to writing "one hundred and twenty three" as
|
| 321
| froh wrote:
| that's why little endian == broken endian
|
| said a friend who also quips: "never trust a computer you
| can lift"
| LightMachine wrote:
| Exactly. This is so infuriating. Whoever let little-endian
| win did humanity a huge disservice.
| jart wrote:
| Blame the people who failed to localize the right-to-left
| convention when arabic numerals were adopted. It's one of
| those things like pi vs. tau or Jacobin weights and
| measures vs. Planck units. Tradition isn't always
| correct. John von Neumann understood that when he
| designed modern architecture and muh hex dump is not an
| argument.
| kstenerud wrote:
| The only benefit to big endian is that it's easier for
| humans to read in a hex dump. Little endian on the other
| hand has many tricks available to it for building
| encoding schemes that are efficient on the decoder side.
| tom_mellior wrote:
| Could you elaborate on these tricks? This sounds
| interesting.
|
| The only thing I'm aware of that's neat in little endian
| is that if you want the low byte (or word or whatever
| suffix) of a number stored at address a, then you can
| simply read a byte from exactly that address. Even if you
| don't know the size of the original number.
| kstenerud wrote:
| I've posted in some other replies, but a few:
|
| - Long addition is possible across very large integers by
| just adding the bytes and keeping track of the carry.
|
| - Encoding variable sized integers is possible through an
| easy algorithm: set aside space in the encoded data for
| the size, then encode the low bits of the value, shift,
| repeat until value = 0. When done, store the number of
| bytes you wrote to the earlier length field. The length
| calculation comes for free.
|
| - Decoding unaligned bits into big integers is easy
| because you just store the leftover bits in the next
| value of the bigint array and keep going. With big
| endian, you're going high bits to low bits, so once you
| pass to more than one element in the bigint array, you
| have to start shifting across multiple elements for every
| piece you decode from then on.
|
| - Storing bit-encoded length fields into structs becomes
| trivial since it's always in the low bit, and you can
| just incrementally build the value low-to-high using the
| previously decoded length field. Super easy and quick
| decoding, without having to prepare specific sized
| destinations.
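|
| A minimal sketch of the long-addition trick (bigint_add is
| an illustrative name; 64-bit limbs stored low-to-high):
|             #include <stddef.h>
|             #include <stdint.h>
|             /* a += b over n limbs, walking memory in order
|                and carrying into the next limb */
|             void bigint_add(uint64_t *a, const uint64_t *b,
|                             size_t n) {
|               unsigned carry = 0;
|               for (size_t i = 0; i < n; i++) {
|                 uint64_t t = a[i] + b[i];
|                 unsigned c1 = t < a[i];  /* first add wrapped? */
|                 a[i] = t + carry;
|                 carry = c1 | (a[i] < t); /* or the carry did */
|               }
|             }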
| mafuy wrote:
| Correct me if I'm wrong, but were the now common numbers
| not imported in the same order from Arabic, which writes
| right to left? So numbers were invented in little endian,
| and we just forgot to translate their order.
| dahart wrote:
| Good question, I just did a little digging to see if I
| could find out. It sounds like old Arabic did indeed use
| little endian in writing and speaking, but modern Arabic
| does not. However, place values weren't invented in
| Arabic, Wikipedia says that occurred in Mesopotamia,
| which spoke primarily Sumerian and was written in
| Cuneiform - where the direction was left to right.
|
| https://en.wikipedia.org/wiki/Number#First_use_of_numbers
|
| https://en.wikipedia.org/wiki/Mesopotamia
|
| https://en.wikipedia.org/wiki/Cuneiform
| gpanders wrote:
| It might not be how humans _write_ numbers but it is
| consistent with how we think about numbers in a base
| system.
|
| 123 = 3x10^0 + 2x10^1 + 1x10^2
|
| So if you were to go and label each digit in 123 with the
| power of 10 it represents, you end up with little endian
| ordering (eg the 3 has index 0 and the 1 has index 2).
| This is why little endian has always made more sense to
| me, personally.
| dahart wrote:
| I always think about _values_ in big endian, largest
| digit first. Scientific notation, for example, since
| often we only care about the first few digits.
|
| I sometimes think about _arithmetic_ in little endian,
| since addition always starts with the least significant
| digit, due to the right-to-left dependency of carrying.
|
| Except lately I've been doing large additions big-endian
| style left-to-right, allowing intermediate "digits" with
| a value greater than 9, and doing the carry pass
| separately after the digit addition pass. It feels easier
| to me to think about addition this way, even though it's
| a less efficient notation.
|
| Long division and modulus are also big-endian operations.
| My favorite CS trick was learning how you can compute any
| arbitrarily sized number mod 7 in your head as fast as
| people are reading the digits of the number, from left to
| right. If you did it little-endian you'd have to remember
| the entire number, but in big endian you can forget each
| digit as soon as you use it.
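|
| A small sketch of the mod-7 trick (mod7 is an illustrative
| name): fold each decimal digit into a running remainder,
| left to right.
|             #include <stdio.h>
|             unsigned mod7(const char *digits) {
|               unsigned r = 0;
|               for (; *digits; digits++)
|                 r = (r * 10 + (unsigned)(*digits - '0')) % 7;
|               return r;
|             }
|             int main(void) {
|               /* 123456789 = 7 * 17636684 + 1 */
|               printf("%u\n", mod7("123456789")); /* prints 1 */
|               return 0;
|             }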
| BenoitEssiambre wrote:
| I don't know, when we write in general, we tend to write
| the most significant stuff first so you lose less
| information if you stop early. Even numbers we truncate:
| "twelve million" instead of something like "twelve million,
| zero thousand, zero hundred and zero".
| lanstin wrote:
| Next you are going to want little endian polynomials, and
| that is just too far. Also, the advantage of big endian
| is it naturally extends to decimals/negative exponents
| where the later on things are less important. X squared
| plus x plus three minus one over x plus one over x
| squared etc.
|
| Loss of big endian chips saddens me like the loss of
| underscores in var names in Go Lang. The homogeneity is
| worth something, thanks intel and camelCase, but the old
| order that passes away and is no more had the beauty of a
| new world.
| occamrazor wrote:
| In German _ein hundert drei und zwanzig_, literally _one
| hundred three and twenty_. The hardest part is telephone
| numbers, which are usually given in blocks of two digits.
| lanstin wrote:
| Well that would be hard for me to learn. I always find
| the small numbers between like 10 and 100 or 1000 the
| hardest for me to remember in languages I am trying to
| learn a bit of.
| mrlonglong wrote:
| In an ideal world which endian format would one go for?
| tempodox wrote:
| I for one would go for big-endian, simply because reading
| memory dumps and byte blocks in assembly or elsewhere works
| without mental byte-swapping arithmetic for multi-byte
| entities.
|
| Just out of curiosity, I would be interested in learning why so
| many CPUs today are little-endian. Is it because it is cheaper
| / more efficient for processor implementations or is it because
| "the others do it, so we do it the same way"?
| anticristi wrote:
| My brain is trained to read little-endian in memory dumps.
| It's no different than the German "fünfundzwanzig" (five
| and twenty). :))
| bombcar wrote:
| https://stackoverflow.com/questions/5185551/why-
| is-x86-littl...
|
| It simplifies certain instructions internally. Practically
| everything is little endian because x86 won.
|
| > And if you think about a serial machine, you have to
| process all the addresses and data one-bit at a time, and the
| rational way to do that is: low-bit to high-bit because
| that's the way that carry would propagate. So it means that
| [in] the jump instruction itself, the way the 14-bit address
| would be put in a serial machine is bit-backwards, as you
| look at it, because that's the way you'd want to process it.
| Well, we were gonna build a byte-parallel machine, not bit-
| serial and our compromise (in the spirit of the customer and
| just for him), we put the bytes in backwards. We put the low-
| byte [first] and then the high-byte. This has since been
| dubbed "Little Endian" format and it's sort of contrary to
| what you'd think would be natural. Well, we did it for
| Datapoint. As you'll see, they never did use the [8008] chip
| and so it was in some sense "a mistake", but that [Little
| Endian format] has lived on to the 8080 and 8086 and [is] one
| of the marks of this family.
| mrlonglong wrote:
| And does middle endian even exist?
| FabHK wrote:
| US date format: 12/31/2021
| bitwize wrote:
| Little endian. There is no extant big-endian CPU that matters.
| mrlonglong wrote:
| I did say in an ideal world.
| bitwize wrote:
| Hint: The reason why it's called "endianness" comes from
| the novel _Gulliver's Travels_, in which the neighboring
| nations of Lilliput and Blefuscu went to bitter, bloody war
| over which end to break your eggs from: the big end or the
| little end. The warring factions were also known as Big-
| Endians and Little-Endians, and each thought themselves
| superior to the dirty heathens on the other side. If one
| side were objectively correct, if there were an inherent
| advantage to breaking your egg from one side or the other,
| would there be a war at all?
| dragonwriter wrote:
| > if there were an inherent advantage to breaking your
| egg from one side or the other, would there be a war at
| all?
|
| Fascism vs. not-fascism, Stalinist Communism vs. Western
| Capitalism, Islamism vs. liberal democracy... I'm not
| sure "the existence of war around a divide in ideas
| proves that neither sides ideas are correct" is a
| particularly comfortable maxim to consider the
| ramifications of.
| enqk wrote:
| https://fgiesen.wordpress.com/2014/10/25/little-endian-vs-bi...
| marcosdumay wrote:
| Why would one choose the memory representation of the number
| based on the advantages of the internal ALU wiring?
|
| Of all those reasons, the only one I can make sense of is the
| "I can't transparently widen fields after the fact!", and
| that one is way too niche to explain anything.
| enqk wrote:
| I don't understand? Why not make the memory representation
| sympathetic with the operations you're going to do on it?
| It's the raison d'etre of computers to compute and to do it
| fast.
|
| Another example: memory representation of pixels in GPUs
| which are swizzled to make computations efficient
| marcosdumay wrote:
| > I don't understand? Why not make the memory
| representation sympathetic with the operations you're
| going to do on it?
|
| There's no reason to, as there's no reason not to. It's
| basically irrelevant.
|
| If carry propagation is so important, why can't you just
| mirror your transistors and operate on the same wires,
| but in the opposite order? Well, you can, and it's
| trivial. (And, by the way, carry propagation isn't that
| important. High-performance ALUs propagate the carry only
| through blocks, which can appear anywhere. And the wiring
| of those isn't even planar, so how you arrange them isn't
| a showstopper.)
| cygx wrote:
| _> So the solution is simple right? Let's just use unsigned char
| instead. Sadly no. Because unsigned char in C expressions gets
| type promoted to the signed type int._
|
| If you do use _unsigned char_, an alternative to masking would
| be performing the cast to _uint32_t_ before instead of after the
| shift.
|
| _edit:_ For reference, this is what it would look like when
| implemented as a function instead of a macro:
|             static inline uint32_t read32be(const uint8_t *p) {
|                 return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
|                        (uint32_t)p[2] << 8  | (uint32_t)p[3];
|             }
| jcadam wrote:
| A while back I was on a project to port a satellite simulator
| from SPARC/Solaris to RHEL/x64. The compressed telemetry stream
| that came from the satellite needed to be in big endian (and
| that's what the ground station software expected), and the
| simulator needed to mimic the behavior.
|
| This was not a problem for the old SPARC system, which naturally
| put everything in the correct order without any fuss, but one of
| the biggest sticking points in porting over to x64 was having to
| now manually pack all of that binary data. Using Ada (what
| else!), of course.
| metiscus wrote:
| If memory serves correctly, Ada 2012 and beyond has language-
| level support for this. I was working on porting some code from
| an aviation platform to run on PC and it was all in Ada 2005,
| so we didn't have the benefit of that available.
| jcadam wrote:
| Same here, Ada2005 for the port. The simulator was originally
| written in Ada95. Part of what made it even less fun was the
| data was highly packed and individual fields crossed byte
| boundaries (these 5 bits are X, the next 4 bits are Y, etc.)
| :(
| bombcar wrote:
| Given enough memory it may be worth treating the whole
| stream internally as a bitstream.
| onox wrote:
| Couldn't you add the Bit_Order and Scalar_Storage_Order
| attributes (or aspects in Ada 2012) to your records/arrays?
| Or did Scalar_Storage_Order not exist at the time?
| the_real_sparky wrote:
| This problem is its own special horror in CAN bus data. Between
| endianness and sign it's a nightmare of en/decoding
| possibilities and the associated mistakes that come with that.
| rwmj wrote:
| TIFF is another one. The only endian-switchable image format
| that I'm aware of.
|
| Fun fact: CD-ROM superblocks have both-endian fields. Each
| integer is stored twice, in big and in little endian format. I
| assume this was to allow underpowered 80s hardware which didn't
| have enough resources to do byte swapping.
| gumby wrote:
| In her first sentence, the phrase "the C / C++ programming
| language" is no longer correct: C++20 requires two's complement
| signed integers.
|
| C++20 is quite new, so I would assume that very few people know
| this yet.
|
| C and C++ obviously differ a lot, but by that phrase she clearly
| means "the part where then two languages overlap". The C++
| committee has been willing to break C compatibility in a few ways
| (not every valid C program is a valid C++ program), and this has
| been true for a while.
| loeg wrote:
| It hasn't been true since C99, at least -- C++ didn't adopt C99
| designated initializers.
| hctaw wrote:
| What chips can be targeted by C compilers today that don't use
| 2's complement?
| gumby wrote:
| I haven't seen a one's complement machine in decades but at
| the time C was standardized there were still quite a few
| (afaik none had a single-chip CPU, to get to your question).
| But since they existed, the language definition didn't
| require it and some optimizations were technically UB.
|
| The C++ committee decided that everyone had figured this out
| by now and so made this breaking change.
| klyrs wrote:
| "the c/c++ language" exists insofar as you can import this c
| code into your c++, and this is something that c++ programmers
| need to know how to do, so they'd better learn enough of the
| differences between c and c++ or they'll be stumped when they
| crack open somebody else's old code.
| mitchs wrote:
| Or just cast the pointer to uint##_t and use be##toh and htobe##
| from <endian.h>? I think this is making a mountain out of a
| molehill. I've spent tons of time doing wire (de)serialization
| in C for network protocols and endian swaps are far from the
| most pressing issue I see. The big problem imo is the unsafe
| practices around buffer handling allowing buffer overruns.
| amluto wrote:
| Why mask and then shift instead of casting to the correct type
| and then shifting, like this:
|             (uint32_t)x[0] << 24 | ...
|
| Of course, this requires that x[0] be unsigned.
| syockit wrote:
| If this is for deserialisation then it's okay for x[0] to be
| signed. You just need to recast the result as int32_t (or
| simply assign to an int32_t variable without any cast) and it
| is not UB.
| baby wrote:
| I suggest this article:
| https://www.cryptologie.net/article/474/bits-and-bytes-order...
| (shameless plug)
| tails4e wrote:
| Ubsan should be on by default. If people don't like it, they
| should be made to turn it off with a switch, so at least it's
| more likely to be run than not. It could save a huge amount of
| time debugging when compilers or architectures change. Without
| it, I'd say many a programmer would be caught by these
| subtleties in the standard. Coming from a HW background
| (Verilog) I'd naturally default to masking and shifting when
| building up larger variables from smaller ones, but I can
| imagine many would not.
| patrakov wrote:
| There was a blog post and a FOSDEM presentation by (misguided)
| Gentoo developers a few years ago, and it was retracted,
| because sanitizers add their own exploitable vulnerabilities
| due to the way they work.
|
| https://blog.hboeck.de/archives/879-Safer-use-of-C-code-runn...
|
| https://www.openwall.com/lists/oss-security/2016/02/17/9
| jart wrote:
| Sanitizers have the ability to bring Rust-like safety
| assurances to all the C/C++ code that exists. The fact that
| existing ASAN runtimes weren't designed for setuid binaries
| shouldn't dissuade us from pursuing those benefits. We just
| need a production-worthy runtime that does fewer things. For
| example, here's the ASAN runtime that's used for the redbean
| web server: https://github.com/jart/cosmopolitan/blob/master/
| libc/intrin...
| pornel wrote:
| Run-time detection and heuristics on a language that is
| hard to analyze (e.g. due to weak aliasing, useless const,
| ad-hoc ownership and thread-safety rules) aren't in the
| same ballpark as compile-time safety guaranteed by
| construction, and an entire modern ecosystem centered
| around safety. Rust can use LLVM sanitizers in addition to
| its own checks, so that's not even a trade-off.
| tails4e wrote:
| Sorry for my ignorance, but surely some UB being used for
| optimization by the compiler is compile-time only. This is
| the part that should be on by default. Runtime detection is
| a different thing entirely, but compile time is a no-brainer.
| MauranKilom wrote:
| > Ubsan should default on
|
| > Could save a huge amount of time debugging when compilers or
| architecture changes.
|
| I'm assuming we come from very different backgrounds, but it's
| not clear to me how switching compilers or _architectures_ is
| so common that hardening code against it _by default_ is
| appropriate. I would think that switching compilers or
| architectures is generally done very deliberately, so
| instrumenting code with UBsan _for that transition_ would be
| the right thing to do?
| toast0 wrote:
| Changing compilers is a pretty regular thing IMHO; I use the
| compiler that comes with the OS and let's assume a yearly OS
| release cycle. Most of those will contain at least some
| changes to the compiler.
|
| I don't really want to have to take that yearly update to go
| through and review (and presumablu fix) all the UB that has
| managed to sneak in over the year. It would be better to have
| avoided putting it in.
| tails4e wrote:
| Changing gcc versions could cause the behaviour of your code
| with undefined behaviour to change. If you rely on UB, whether
| you know you are or not, you are in for a bad time. Ubsan at
| least lets you know if your code is robust, or a ticking time
| bomb...
| jedisct1 wrote:
| Sanitizers may introduce side channels. This is an issue for
| crypto code.
| rwmj wrote:
| If you can assume GCC or Clang then __builtin_bswap{16,32,64}
| functions are provided which will be considerably more efficient,
| less error-prone, and easier to use than anything you can
| homebrew.
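|
| A tiny usage sketch (assuming GCC or Clang, per the comment):
|             #include <stdint.h>
|             #include <stdio.h>
|             int main(void) {
|               uint32_t x = 0x11223344u;
|               /* prints 0x44332211 */
|               printf("0x%08x\n", __builtin_bswap32(x));
|               return 0;
|             }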
| dataflow wrote:
| _byteswap_{ushort,ulong,uint64} for MSVC. Together with yours
| on x86 these should take care of the three major compilers.
| st_goliath wrote:
| Well, yes. The only thing missing is knowing if you have to
| swap or not, if you don't want to assume your code will run on
| little endian systems exclusively.
|
| Or, on Linux and BSD systems at least, you can use the
| <endian.h> or <sys/endian.h> functions
| (https://linux.die.net/man/3/endian) and rely on the libc
| implementation to do the system/compiler detection for you and
| use an appropriate compiler builtin inside of an inline
| function instead of bothering to hack something together in
| your own code.
|
| The article mentions those functions at the bottom, but
| strangely still recommends hacking up your own macros.
| jart wrote:
| That's not true. If you write the byte swap in ANSI C using the
| gigantic mask+shift expression it'll optimize down to the bswap
| instruction under both GCC and Clang, as the blog post points
| out.
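|
| For reference, a sketch of the kind of mask+shift expression
| meant here; recent GCC and Clang pattern-match it down to a
| single bswap at -O2:
|             #include <stdint.h>
|             uint32_t bswap32_portable(uint32_t x) {
|               return (x & 0x000000ffu) << 24 |
|                      (x & 0x0000ff00u) << 8  |
|                      (x & 0x00ff0000u) >> 8  |
|                      (x & 0xff000000u) >> 24;
|             }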
| rwmj wrote:
| Assuming the macros or your giant expression are correct. But
| you might as well use the compiler intrinsics which you
| _know_ are both correct and the most efficient possible, and
| get on with your life.
| jart wrote:
| Sorry, I'd rather place my faith in arithmetic than in
| someone's API, provided the compiler is smart enough to
| understand the arithmetic and optimize accordingly.
| loeg wrote:
| "Someone" here is the same compiler you're trusting to
| optimize your giant arithmetic expression of the same
| idea. Your statement is internally inconsistent.
| lanstin wrote:
| There is value in keeping completely clear in your head
| the difference between a value with arithmetic semantics
| and a value with octets-in-a-stream semantics.
| That thinking will work in all contexts, while the
| compiler knowledge is limited. The thinking will help you
| write correct ways to encode data in the URL or into a
| file being uploaded that your code generates for discord
| or whatever, in Python, without knowledge of the true
| endianness of the system the code is running on.
| [deleted]
| borman wrote:
| Funny that compilers (e.g. clang:
| https://github.com/llvm/llvm-
| project/blob/b04148f77713c92ee5... ) might be able to do that
| only because someone on the compiler team has hand-coded a
| bswap expression detector.
| bombcar wrote:
| Given it can be done with careful code AND many processors have
| a single instruction to do it I'm surprised it hasn't been
| added to the C standard.
| savant2 wrote:
| The article explicitly shows that the provided macros are very
| efficient with a modern compiler. You can check on godbolt.org
| that they emit the same code.
|
| Though the article only mentions bswap64; mentioning
| __builtin_bswap64 would be a nice addition.
| fanf2 wrote:
| But then you have to #ifdef the endianness of the target
| architecture. If you do it the right way as Russ Cox and
| Justine Tunney say, then your code can serialize and
| deserialize correctly regardless of the platform endianness.
| chrisseaton wrote:
| __builtin_bswap does exactly the same thing as the macros.
| russdill wrote:
| The fallacy in the article is that anyone should code these
| functions. There's plenty of public domain libraries that do
| this correctly.
|
| https://github.com/rustyrussell/ccan/blob/master/ccan/endian...
| nly wrote:
| My favourite builtins are the overflow checked integer
| operations:
|
| https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins...
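|
| A small usage sketch (GCC/Clang; __builtin_add_overflow
| returns true when the result wrapped):
|             #include <stdint.h>
|             #include <stdio.h>
|             int main(void) {
|               int32_t sum;
|               if (__builtin_add_overflow(INT32_MAX, 1, &sum))
|                 puts("overflow detected");  /* taken */
|               return 0;
|             }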
| captainmuon wrote:
| It is a ridiculous feature of modern C that you have to write the
| super verbose "mask and shift" code, which then gets compiled to
| a simple `mov` and maybe a `bswap`. Whereas the direct
| equivalent in C, an assignment with a (type-changing) cast, is
| illegal.
| There is a huge mismatch between the assumptions of the C spec
| and actual machine code.
|
| One of the few reasons I ever even reached for C is the ability to
| slurp in data and reinterpret it as a struct, or the ability to
| reason in which registers things will show up and mix in some
| `asm` with my C.
|
| I think there should really be a dialect of C(++) where the
| machine model is exactly the physical machine. That doesn't mean
| the compiler can't do optimizations, but it shouldn't do things
| like prove code as UB and fold everything to a no-op. (Like when
| you defensively compare a pointer to NULL that according to spec
| must not be NULL, but practically could be...)
|
| `-fno-strict-overflow -fno-strict-aliasing -fno-delete-null-
| pointer-checks` gets you halfway there, but it would really only
| be viable if you had a blessed `-std=high-level-assembler` or
| `-std=friendly-c` flag.
| MrBuddyCasino wrote:
| > There is a huge mismatch between the assumptions of the C
| spec and actual machine code.
|
| People like to say "C is close to the metal". Really not true
| at all anymore.
| goldenkey wrote:
| Actually, it is true - which is why endianness is a problem
| in the first place. ASM code is different when written for
| little endian vs big endian. Access patterns are positively
| offset instead of negatively.
|
| A language that does the same things regardless of endianness
| would not have pointer arithmetic. That is not ASM and not C.
| pjmlp wrote:
| It does, macro assemblers, specially those with PC and Amiga
| roots.
|
| Which, given its heritage, is what PDP-11 C used to be;
| after all, BCPL's origin was as the minimal language required
| to bootstrap CPL, nothing else.
|
| Actually, I think TI has a macro assembler with a C-like
| syntax, I just cannot recall the name any longer.
| simias wrote:
| > _Whereas the direct equivalent in C, an assignment with a
| (type-changing) cast, is illegal._
|
| I don't understand what you mean by that. The direct equivalent
| of what? Endianness is not part of the type system in C so I'm
| not sure I follow.
|
| > _I think there should really be a dialect of C(++) where the
| machine model is exactly the physical machine._
|
| Linus agrees with you here, and I disagree with both of you.
| _Some_ UBs could certainly be relaxed, but as a rule I want my
| code to be portable and for the compiler to have enough leeway
| to correctly optimize my code for different targets without
| having to tweak my code.
|
| I want strict aliasing and I want the compiler to delete
| extraneous NULL pointer checks. Strict overflow I'm willing to
| concede; at the very least the standard should mandate wrap-on-
| overflow even for signed integers IMO.
| lanstin wrote:
| I am sympathetic, but portability was more important in the
| past and gets less important each year. I used to write code
| strictly keeping the difference between numeric types and
| sequences of bytes in mind, hoping to one day run on an Alpha
| or a Tandem or something, but it has been a long time since I
| have written code that runs on anything other than Intel, AMD,
| or little-endian ARM.
| mhh__ wrote:
| D's machine model does actually assume the hardware, and using
| compile-time metaprogramming you can pretty much do
| whatever you want when it comes to bit twiddling - whether that
| means assembly, flags etc.
| pornel wrote:
| Of course nobody wants C to backstab them with UB, but at the
| same time programmers want compilers to generate optimal code.
| That's the market pressure that forces optimizers to be so
| aggressive. If you can accept less optimized code, why aren't
| you using tcc?
|
| The idea of C that "just" does a straightforward machine
| translation breaks down almost immediately. For example, you'd
| want `int` to just overflow instead of being UB. But then it
| turns out indexing `arr[i]` can't use 64-bit memory addressing
| modes, because they don't overflow like a 32-bit int does. With
| UB it doesn't matter, but a "straightforward C" would emit
| unnecessary separate 32-bit mul/shift instructions.
|
| https://gist.github.com/rygorous/e0f055bfb74e3d5f0af20690759...
| MaxBarraclough wrote:
| > nobody wants C to backstab them with UB, but at the same
| time programmers want compilers to generate optimal code
|
| The value of compiler optimization isn't the same thing as
| the value of having extensive undefined behaviour in a
| programming language.
|
| Rust and Ada perform about the same as C, but lack C's many
| footguns.
|
| > indexing `arr[i]` can't use 64-bit memory addressing modes
|
| What do you mean here?
| remexre wrote:
| Typically, the assembly instruction that would do the read
| in arr[i] can do something like:
|             x = *(y + z);
|
| where y and z are both 64-bit integers. If I had
|             int arr[1000];
|             initialize(&arr);
|             int i = read_int();
|             int x = arr[i];
|             print(x);
|
| then to get x I'd need to do something like
|             tmp = i * 4;
|             tmp1 = (uint64_t)tmp;
|             x = *(arr + tmp1);
|
| Which, since i is signed, can't just be a cheap shift, and
| then needs to be upcast to a uint64_t (which is cheap, at
| least).
| ajross wrote:
| > There is a huge mismatch between the assumptions of the C
| spec and actual machine code.
|
| Right, which is why the kind of UB pedantry in the linked
| article is hurting and not helping. Cranky old man perspective
| here:
|
| Folks: the fact that compilers will routinely exploit edge
| cases in undefined behavior in the language specification to
| miscompile obvious idiomatic code is a _terrible bug in the
| compilers_. Period. And we should address that by fixing the
| compilers, potentially by amending the spec if feasible.
|
| But instead the community wants to all look smart by showing
| how much they understand about "UB" with blog posts and (worse)
| drive-by submissions to open source projects (with passive-
| aggressive sneers about code quality), so nothing gets better.
|
| Seriously: don't tell people to shift and mask. Don't
| pontificate over compiler flags. Stop the masturbatory use of
| ubsan (though the tool itself is great). And start submitting
| bugs against the toolchain to get this fixed.
| wnoise wrote:
| I read this, and go "yes, yes, yes", and then "NO!".
|
| Shifts and ors really is the sanest and simplest way to
| express "assembling an integer from bytes". Masking is _a_
| way to deal with the current C spec which has silly promotion
| rules. Unsigned everything is more fundamental than signed.
| jart wrote:
| I agree, but the language of the standard very unambiguously
| lets them do it. Quoth X3.159-1988:
|             * Undefined behavior --- behavior, upon use of a
|               nonportable or erroneous program construct, of
|               erroneous data, or of indeterminately-valued
|               objects, for which the Standard imposes no
|               requirements. Permissible undefined behavior
|               ranges from ignoring the situation completely
|               with unpredictable results, to behaving during
|               translation or program execution in a documented
|               manner characteristic of the environment (with
|               or without the issuance of a diagnostic message),
|               to terminating a translation or execution (with
|               the issuance of a diagnostic message).
|
| In the past compilers "behaved during translation or program
| execution in a documented manner characteristic of the
| environment" and now they've decided to "ignore the situation
| completely with unpredictable results". So yes what gcc and
| clang are doing is hostile and dangerous, but it's legal.
| https://justine.lol/undefined.png So let's fix our code. The
| blog post is intended to help people do that.
| userbinator wrote:
| _So let's fix our code._
|
| No; I say we force the _compiler writers_ to fix their
| idiotic assumptions instead of bending over backwards to
| please what 's essentially a tiny minority. There's a lot
| more programmers who are not compiler writers.
|
| The standard is really a _minimum bar_ to meet, and what's
| not defined by it is left to the discretion of the
| implementers, who should be doing their best to follow the
| "spirit of C", which ultimately means behaving sanely. "But
| the standard allows it" should never be a valid argument
| --- the standard allows a lot of other things, not all of
| which make sense.
|
| A related rant by Linus Torvalds:
| https://bugzilla.redhat.com/show_bug.cgi?id=638477#c129
| cbmuser wrote:
| > One of the few reasons I ever even reached to C is the
| ability to slurp in data and reinterpret it as a struct, or the
| ability to reason in which registers things will show up and
| mix in some `asm` with my C.
|
| Which results in undefined behavior according to the C ISO
| standard.
|
| Quote:
|
| "2 All declarations that refer to the same object or function
| shall have compatible type; otherwise, the behavior is
| undefined."
|
| From: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf
| 6.2.7
| [deleted]
| jstanley wrote:
| Exactly.
| innocenat wrote:
| How? I mean, doesn't GP mean this?
|             struct whatever p;
|             fread(&p, sizeof(p), 1, fp);
| tsimionescu wrote:
| It should be perfectly fine to do this:
|             union reinterpret {
|                 char raw[100];
|                 struct myStruct interpreted;
|             } example;
|             read(fd, example.raw, sizeof(example.raw));
|             struct myStruct dest = example.interpreted;
|
| This is standard-compliant C code, and it is a common way of
| reading IP addresses from packets, for example.
| saagarjha wrote:
| (It should be noted that this is not valid C++ code.)
| nine_k wrote:
| I suspect you might like C--.
|
| https://en.m.wikipedia.org/wiki/C--
| froh wrote:
| you could instead simply use hton/ntoh and trust the library
| properly does The Right Thing tm
| nly wrote:
| > I think there should really be a dialect of C(++) where the
| machine model is exactly the physical machine.
|
| Sounds great, until you have to rewrite all your software to go
| from x86-64 to ARM
| pjmlp wrote:
| Quite common when coding games back in the 8 and 16 bit days.
| :)
|
| However for the case in hand, it would suffice to just write
| the key routines in Assembly, not everything.
| pm215 wrote:
| So in your 'machine model is the physical machine' flavour,
| should "I cast an unaligned pointer to a byte array to int32_t
| and deref" on SPARC (a) do a bunch of byte-load-and-shift-and-
| OR or (b) emit a simple word load which segfaults? If the
| former, it's not what the physical machine does, and if the
| latter, then you still need to write the code as "some portable
| other thing". Which is to say that the spec's UB here is in
| service of "allow the compiler to just emit a word load when
| you write *(int32_t)p".
|
| What I think the language is missing is a way to clearly write
| "this might be unaligned and/or wrong endianness, handle that".
| (Sometimes compilers provide intrinsics for this sort of gap,
| as they do with popcount and count-leading-zeroes; sometimes
| they recognize common open-coded idioms. But proper
| standardised support would be nicer.)
| jart wrote:
| Endianness doesn't matter though, for the reasons Rob Pike
| explained. For example, the bits inside each byte have an
| endianness probably inside the CPU but they're not
| addressable so no one thinks about that. The brilliance of
| Rob Pike's recommendation is that it allows our code to be
| byte order agnostic for the same reasons our code is already
| bit order agnostic.
|
| I agree about bsf/bsr/popcnt. I wish ASCII had more
| punctuation marks because those operations are as fundamental
| as xor/and/or/shl/shr/sar.
| klodolph wrote:
| You don't have to mask and shift. You can memcpy and then byte
| swap in a function. It will get inlined as mov/bswap.
|
| Practically speaking, common compilers have intrinsics for
| bswap. The memcpy function can be thought of as an intrinsic
| for unaligned load/store.
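|
| A minimal sketch of that pattern (load32be is an illustrative
| name; assumes a little-endian target and GCC/Clang for the
| builtin):
|             #include <stdint.h>
|             #include <string.h>
|             static inline uint32_t load32be(const void *p) {
|               uint32_t v;
|               /* legal unaligned, alias-safe load */
|               memcpy(&v, p, sizeof v);
|               return __builtin_bswap32(v);
|             }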
| BeeOnRope wrote:
| How do you detect if a byte swap is needed? I.e. wether the
| (fixed) wire endianness matches the current platform
| endianness?
| edflsafoiewq wrote:
| Ie how do you know the target's endianness? C++20 added
| std::endian. Otherwise you can use a macro like this one
| from SDL
|
| https://github.com/libsdl-
| org/SDL/blob/9dc97afa7190aca5bdf92...
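|
| One portable run-time check, as a sketch (compilers typically
| constant-fold it; is_little_endian is an illustrative name):
|             #include <stdint.h>
|             #include <string.h>
|             static inline int is_little_endian(void) {
|               const uint32_t one = 1;
|               unsigned char first;
|               memcpy(&first, &one, 1);
|               return first == 1;
|             }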
| hermitdev wrote:
| There have been CPU architectures where the endianness at
| compile time isn't necessarily sufficient. I forget
| which, maybe it was DEC Alpha, where the CPU could flip
| back and forth? I can't recall if it was a "choose at
| boot" or a per process change.
| magicalhippo wrote:
| ARM allows dynamic changing of endianess[1].
|
| [1]:
| https://developer.arm.com/documentation/dui0489/h/arm-
| and-th...
| user-the-name wrote:
| When do you byte swap?
| themulticaster wrote:
| The first example in the article is flawed (or at least
| misleading).
|
| 1) They define a char array (which defaults to signed char, as
| mentioned in the post), including the value 0x80 which can't be
| represented in char, resulting in a compiler warning (e.g. in GCC
| 11.1).
|
| The mentioned reason against using unsigned char (that shifting
| 128 left by 24 places results in UB) is also misleading: I could
| not reproduce the UB when changing the array to unsigned char.
| Perhaps the author meant leaving the array defined as signed
| char, but casting the signed chars to unsigned before shifting.
| That indeed results in UB, but I don't see why you would define
| the array as signed in the first place.
|
| 2) The cause of the undefined behavior isn't the bswap_32;
| rather, it's that they try to read a uint32_t value from a
| char array where b[0] is not aligned on a word boundary.
|
| There is no need at all to redefine bswap. The simple solution
| would be to use an unsigned char array instead of a char array
| and just read the values byte-wise.
|
| Of course C has its footguns and warts and so on, but there is no
| need to dramatize it this much in my opinion.
|
| I've prepared a Godbolt example to better explain the arguments
| mentioned above: https://godbolt.org/z/Y1EWK6e17
|
| Edit: To add to point 2) above: Another way to avoid the UB (in
| this specific case) would be to add __attribute__ ((aligned (4)))
| to the definition of b. In that case, even reading the array as a
| single uint32_t works as expected since the access is aligned to
| a word boundary.
|
| Obviously, you can't expect any random (unsigned char) pointer to
| be aligned on a word boundary. Therefore, it is still necessary
| to read the uint32_t byte by byte.
| cygx wrote:
| _> The mentioned reason against using unsigned char (that
| shifting 128 left by 24 places results in UB) is also
| misleading_
|
| No, that reasoning is correct. Integer promotions are performed
| on the operands of a shift expression, meaning the left operand
| will be promoted to signed int even if it starts out as
| unsigned char. Shifting a byte value with the highest bit set
| left by 24 will result in a value not representable as signed
| int, leading to UB.
| themulticaster wrote:
| Thanks, I just noticed a small mistake in my example (I don't
| trigger the UB because I access b[0] containing 0x80 without
| shifting, however I meant to do it the other way around).
|
| Still, adding an explicit cast to the left operand seems to
| be enough to avoid this, e.g.:
|             uint32_t x = ((uint32_t)b[0]) << 24;
|
| In summary, I think my point that using unsigned char would
| be appropriate in this case still stands.
| cygx wrote:
| _> Still, adding an explicit cast to the left operand seems
| to be enough to avoid this_
|
| Indeed. See my other comment,
| https://news.ycombinator.com/item?id=27086482
| commandersaki wrote:
| It wasn't clear to me: what was the undefined behaviour in the
| naive approach?
| cygx wrote:
| Violation of the effective typing rules ('strict aliasing') and
| a potential violation of alignment requirements of your
| platform.
| [deleted]
___________________________________________________________________
(page generated 2021-05-08 23:00 UTC)