[HN Gopher] Parsing time stamps faster with SIMD instructions
       ___________________________________________________________________
        
       Parsing time stamps faster with SIMD instructions
        
       Author : mfiguiere
       Score  : 179 points
       Date   : 2023-07-02 14:50 UTC (8 hours ago)
        
 (HTM) web link (lemire.me)
 (TXT) w3m dump (lemire.me)
        
       | victor82 wrote:
       | The solution exposed don't seems competitive enough to solve with
       | good rank the problem exposed here https://highload.fun/tasks/14
       | 
       | That site is really good to play with SIMD code.
        
         | reaperman wrote:
         | How can I see Yuriy's top solution for this task?
        
           | anonymoushn wrote:
           | You can ask him but he probably won't share
        
             | reaperman wrote:
             | Kind of a weird leaderboard if you can't even check to see
             | if answers satisfy correctness. I've seen plenty of
             | leaderboards where the top answers would fail cases that
             | weren't covered by the leaderboards testing.
        
               | paulmd wrote:
               | Yeah, most of the leetcode top solutions are a hardcoded
               | table of answers for the scoring questions followed by
               | the general solution.
        
               | anonymoushn wrote:
               | I think it's not such a big deal because the input
               | generators are usually pretty good and because most
               | solutions that give up some correctness for performance
               | can pretty cheaply detect that they done goofed and fall
               | back to a slow path where they start over and do it some
               | other way. This ends up being useful because scoring is
               | the median runtime of 3 successful runs (so if you fail
               | one of the 3 then your submission fails completely). It
               | also means that probabilistic solutions that would be
               | correct in non-adversarial environments (like using a
               | bloom filter or N bloom filters tuned to give an exact
               | result given the input constraints with probability
               | >99.9%) are admissible.
        
         | victor82 wrote:
         | The solution codes are not available as you can continuously
         | update your solution.
        
           | pvitz wrote:
           | What is actually the task here? Parsing is clear and it looks
           | like one has to sum all parsed timestamps, but what does the
           | additional comment about the shortest time mean?
        
             | anonymoushn wrote:
             | It means that your solution is scored on how fast it runs.
             | 
             | Frequently we end up with solutions that use tricks that
             | would not be applicable to the n=1 problem in TFA ("parse 1
             | timestamp") when we try to get the fastest solutions for
             | "find the sum of all these timestamps."
        
         | renewiltord wrote:
         | Is the code for some of the solutions available? Thanks for the
         | link. That's a fun place.
        
       | jeffbee wrote:
       | I don't think I have ever seen %Y%m%d%H%M%S used in real life.
        
         | wood_spirit wrote:
         | in datasets that I've worked with it is so incredibly common
         | and so incredibly slow that I've previously made a fast library
         | to work with them: https://github.com/williame/TimeMillis
         | 
         | Not as fast as SIMD, perhaps, but much faster than the Java
         | standard library...
        
           | jeffbee wrote:
           | "The TimeMillis library offers helper functions to read,
           | write and operate on timestamps stored as milliseconds in
           | UNIX epoch."
           | 
           | That doesn't seem related?
        
             | gliptic wrote:
             | The parse function takes a string in the format mentioned.
        
         | tuukkah wrote:
         | Sweden uses "%Y-%m-%d %H:%M:%S" - removal of the separators
         | using SIMD left as an exercise ;-)
         | https://en.wikipedia.org/wiki/Date_and_time_notation_in_Swed...
        
           | GordonS wrote:
           | China does too, and IIRC Japan and South Korea.
        
         | tjoff wrote:
         | You mean without delimiters, or? Stripping superfluous
         | delimiters in logs isn't unusal.
        
         | vardump wrote:
         | That's a very common format.
        
           | [deleted]
        
           | morelisp wrote:
           | Other than RRSIG (which I would argue is not at all common),
           | where have you seen it?
        
             | masklinn wrote:
             | Timestamped filenames.
        
               | morelisp wrote:
               | Whenever I've seen it on timestamped filenames I've also
               | needed sub-second precision.
               | 
               | If that's a performance concern, a (potentially zero-
               | padded, but realistically you don't even needed that)
               | Unix timestamp works just as well.
        
               | wongarsu wrote:
               | I've seen lots of %Y%m%d on filenames timestamped by
               | humans.
               | 
               | People timestamp stuff for different reasons. If you do
               | it to avoid collisions then adding sub-second precision
               | makes sense, but in many contexts second or even day
               | precision is all you care about.
               | 
               | The important point is that %Y%m%d%H%M%S is human
               | parsable and still sorts correctly (though humans
               | typically prefer %Y-%m-%d %H:%M:%S for readability)
        
             | horse666 wrote:
             | [dead]
        
         | dahart wrote:
         | Nothing wrong with not having seen it, but I default to time
         | stamps like this for log files, I use it all the time in
         | scripts. I even also do it manually and have an alias in my
         | .bashrc I call 'now' that spits out a call to date with that
         | format string. This is, obviously, useful for organizing and
         | sorting and finding files & folders by date & time. (And,
         | hopefully obviously, creation & mod times on files are
         | completely unreliable for this once you factor in copying &
         | archiving.)
        
       | lyu07282 wrote:
       | This is really cool, I recently wanted to implement a ARGB->RGBA
       | conversion function with SIMD in Rust, but as far as I can tell
       | SIMD is STILL not in stable Rust :(
        
         | mirsadm wrote:
         | You won't get much speed up using SIMD for such a simple task.
         | You'll be memory bandwidth limited.
        
         | lpghatguy wrote:
         | The CPU-specific intrinsic stabilized a little while ago for
         | cases where you need SIMD on a specific platform at least. A
         | lot of crates like glam seem to only support SIMD on x86/x86-64
         | for that reason.
         | 
         | I know that `wide`, `faster`, and `simdeez` are options for
         | crates to do cross-platform SIMD in safe Rust. Eventually we'll
         | have `std::simd`, but I don't know what the timeline is or what
         | the blockers are.
        
         | ComputerGuru wrote:
         | SIMD _is_ stable in rust, it just requires unsafe.
         | 
         | Here's an article I wrote about using SIMD in rust, with a
         | real-world example: https://neosmart.net/blog/using-simd-
         | acceleration-in-rust-to...
        
       | mabbo wrote:
       | How hard would it be to modify this to work with ISO-8601, the
       | international time stamp format?
       | 
       | In the most common case, you'd have something like
       | 2023-07-02T15:46:00Z possibly with a ".123" been the last "0" and
       | "Z".
       | 
       | Would it be fastest to just copy the relevant chars to the
       | correct place for the code in the post? Or could you do it in
       | place somehow?
        
         | zokier wrote:
         | Shouldn't be too hard, assuming you don't need any validation.
         | There afaik are simd instructions to move bytes around (vpermb
         | maybe?), so that should take care of getting rid of the
         | separators
        
         | anonymoushn wrote:
         | The code in the post doesn't seem to finish the job. For a
         | timestamp like "2023-07-02T15:46:00Z" or the other formats you
         | mentioned, the timestamp fits in a single 32-byte register, so
         | you could write code that deals with the stuff as it is. If you
         | have a mix of timestamps with different numbers of decimals or
         | with different time zones, dealing with those things will take
         | a bit of work compered to the ideal case where you had all "Z"
         | or all "+" or all "-" for the time zones and all the same
         | number of decimals for the fractional seconds.
        
       | zokier wrote:
       | > We know that some character must be smaller than 9, for
       | example, we cannot have more than 59 seconds and never 60
       | seconds, in the time stamp string.
       | 
       | This is how you get leap seconds bugs?
        
         | ryandrake wrote:
         | Yea, my eye twitched when I read that and I said to myself "so,
         | who's gonna tell him?" Glad to see this is the top comment.
         | This is why we should not roll our own anything.
        
           | habitue wrote:
           | > This is why we should not roll our own anything.
           | 
           | This seems like a major overcorrection. It's equivalent to
           | saying "you can't know anything" "no one is qualified to do
           | anything"
           | 
           | You really have to consider the downsides. In cryptography,
           | "don't roll you own crypto" is the maxim because the downside
           | is pretty bad. Make sure you're an expert before writing
           | crypto.
           | 
           | What's the downside to mis-parsing leap seconds? Maybe a 500
           | on a super rare request? Then they patch it and it's fine?
           | Most computers are off by more than 1 second anyway.
           | 
           | I think this mindset comes from developers looking into some
           | field and realizing it's super deep ("Did you know there are
           | missing dates in the gregorian calendar if you go back
           | hundreds of years? Your time library is incorrect if it
           | doesn't handle these cases!"). And sure, most things aren't
           | completely correct, but you have to consider what the
           | downside is to being slightly wrong. Maybe it'll be wrong in
           | cases that will never actually happen in practice.
        
           | simlevesque wrote:
           | > This is why we should not roll our own anything.
           | 
           | How is anything going to be made then ?
           | 
           | Also, the author isn't just anybody and has created stuff
           | that you've most likely interacted with. If almost nobody
           | should "roll his own", he's one of those who should.
        
         | dataflow wrote:
         | I believe if you're using epoch timestamps you already have
         | leap seconds bugs. I don't think they increment/decrement when
         | the leap second happens.
        
           | morelisp wrote:
           | If you're using epoch timestamps to represent time you don't
           | have leap second bugs. Each Unix timestamp corresponds to one
           | unique point in time which also has some representation in
           | civil time.
           | 
           | If you're subtracting Unix timestamps to get durations you
           | may have bugs, but those bugs might align better to your
           | user's expectations anyway.
        
             | wongarsu wrote:
             | > Each Unix timestamp corresponds to one unique point in
             | time
             | 
             | That depends. 30th June 2015 23:59:59 has the timestamp
             | 1435708799. 1st July 2015 00:00:00 has the timestamp
             | 1435708800. But what about the leap second that happened
             | between those two seconds? I think most people would claim
             | it also has timestamp 1435708799. but then "Each Unix
             | timestamp corresponds to one unique point in time" isn't
             | true, because 1435708799 now corresponds to two different
             | points in time (30th June 2015 23:59:59 and 30th June 2015
             | 23:59:60). Saying there is a unique mapping from unix
             | timestamp to time would only be true if you either say that
             | 30th June 2015 23:59:60 has no unix timestamp (which isn't
             | a valid return value of any time API I've seen) or if you
             | smear time.
        
               | jodrellblank wrote:
               | It still wouldn't be true because there are points in
               | time smaller than a second.
        
               | morelisp wrote:
               | No, I didn't say it was a bijective mapping. (Although
               | with smearing, it still can be.)
        
               | bobbylarrybobby wrote:
               | Well, all time stamps are rounded or floored to the
               | nearest second (or millisecond, or micro, or...) because
               | you only have a finite amount of space to store them --
               | you always have infinitely many time stamps mapping to
               | each Unix second.
               | 
               | This means you can just adjust the mapping around the
               | leap second. A naive implementation might have those two
               | real-life seconds tick twice as fast to take only one
               | Unix second. Or you could have the whole day's seconds
               | tick 86401/86400 times as fast, etc. In terms of the
               | mapping from continuous time to discrete time, this is
               | just as fine as the non-leap-second case.
        
               | morelisp wrote:
               | > 30th June 2015 23:59:60 has no unix timestamp (which
               | isn't a valid return value of any time API I've seen)
               | 
               | https://go.dev/play/p/I8GcKBoAVXi
        
             | zokier wrote:
             | > Each Unix timestamp corresponds to one unique point in
             | time which also has some representation in civil time
             | 
             | This is plain false. For example 1435708799 corresponds to
             | both 2015-06-30T23:59:59Z and 2015-06-30T23:59:60Z
        
               | morelisp wrote:
               | No, your misdesigned ntpd resets your clock to 1435708799
               | at 2015-06-30T23:59:60Z. I have yet to see a convincing
               | argument that smearing and freezing are not compatible
               | with POSIX's definitions of time.
        
               | zokier wrote:
               | its kernel that does the leap second handling, ntpd just
               | informs it of upcoming leaps
               | 
               | https://elixir.bootlin.com/linux/v6.4.1/source/kernel/tim
               | e/n...
               | 
               | sure leap smearing could be considered compatible with
               | posix in the sense that posix does afaik not require any
               | level of accuracy from clocks, so even something silly
               | like xkcd 2266 could be considered posix compatible
               | https://xkcd.com/2266/
        
             | dataflow wrote:
             | No, bugs can come up without any subtraction. If an event
             | was recorded as happening at 23:59:60.2, you need it to be
             | sequenced strictly after 23:59:59.9. Winding back the clock
             | by 1 second makes it so you end up with the later event as
             | having occurred at 23:59:59.2, violating causality. Even
             | without sub-second precision the situation is still buggy,
             | just not as bad - you end up with two simultaneous events
             | when there weren't any in the original representation,
             | which is itself quite surprising when the two look like
             | they have the same precision.
        
               | heywhatupboys wrote:
               | > you end up with the later event as having occurred at
               | 23:59:59.1
               | 
               | Unix time is seconds since 1970... If you convert between
               | time formats you MAY loose information. That has nothing
               | to do with unix time.
        
               | dataflow wrote:
               | "Lose information" is not a binary thing, nor is it
               | equivalent to introducing _wrong_ information. You don 't
               | expect (short)INT_MAX to suddenly give you 1337 just
               | because it "loses information". You still want to have
               | (and do have) some useful properties & information to be
               | preserved even after certain information loss, otherwise
               | the implementation is buggy despite the information loss.
               | 
               | In this case, without leap seconds, you'd have still lost
               | the _precision_ , but you wouldn't have actually violated
               | causality. Winding back during a leap second makes it so
               | that a timestamp that's > 1 second later in real time
               | still ends up being ordered as strictly before the
               | earlier time. That's a bug, regardless of how you slice
               | it.
               | 
               | Now this can be partially avoided by saturating the
               | subtraction at the boundary rather than winding it back,
               | and I'm not sure how many implementations do this. That
               | seems better, but it's still surprising (and thus bug-
               | prone) given the input and output representation
               | otherwise appear to have sufficient precision.
        
               | zokier wrote:
               | > Unix time is seconds since 1970...
               | 
               | This is unfortunately common misconception. Unix time in
               | reality is seconds since epoch _minus the number of leap
               | seconds_
        
         | notfed wrote:
         | Can you elaborate? How does this lead to a leap second bug?
         | Would a string really say "60" as the number of seconds,
         | somehow, due to leap seconds?
        
           | layer8 wrote:
           | Yes: http://www.niceties.com/leapsec.html
           | 
           | See also: http://leapsecond.com/notes/leap-watch.htm
        
         | paulddraper wrote:
         | > we want to convert them to an integer presenting the number
         | of seconds since the Unix epoch. The Unix epoch is January 1st
         | 1970.
         | 
         | UNIX time ignores leap seconds.
         | 
         | I.e. it's the number of seconds ignoring leap seconds since
         | 1970.
        
           | dataflow wrote:
           | > UNIX time ignores leap seconds.
           | 
           | Side note but I always found this phrasing to be incredibly
           | confusing (hence the need for the "i.e."...); it's the exact
           | opposite of what I would think it means to "ignore" leap
           | seconds. Ignoring something means you don't change your
           | behavior when that something happens. But when leap seconds
           | happen, the epoch timestamp _halts_ its counting. That 's not
           | ignoring leap seconds -- that's actively countering the leap
           | seconds!
        
             | Groxx wrote:
             | yeah - "ignoring" to me means "is ignorant of" means "one
             | second passing means increasing the time by one second".
             | You do by far the most obvious thing when time passes, and
             | you ignore the complexities of reality to continue being
             | maximally obvious.
             | 
             | Doing anything else means _handling_ them by adding custom
             | behavior.
        
             | paulddraper wrote:
             | Ignores = does nothing
             | 
             | Normally, UNIX time goes up by one each second.
             | 
             | Unless, it's a leap second, in which case UNIX time does
             | nothing.
        
               | dataflow wrote:
               | Halting the advancement that would've ordinarily happened
               | is not "ignoring it", it's doing something to actively
               | counter it. That "something" ends up canceling the
               | previous trajectory to evaluate to "nothing", but that
               | doesn't happen through ignorance - it required a
               | countering action.
               | 
               | Example: You're driving your car when pedestrians appear
               | in front of you. What does "ignoring" them constitute?
               | Stopping the car, or running into them at the previous
               | speed?
        
               | agons wrote:
               | I'd say that the people counting leap seconds are the
               | ones "doing something".
               | 
               | At the point they needed to be inserted, Unix systems
               | carry on counting seconds as units of time as they pass,
               | "ignoring" the need to add 1 for the purposes of
               | accounting.
        
               | zokier wrote:
               | > At the point they needed to be inserted, Unix systems
               | carry on counting seconds as units of time as they pass
               | 
               | unix systems essentially will roll back clocks during
               | leap seconds. here is handy dandy table as an example:
               | Time                         tv_sec         tv_usec
               | 2015-06-30T23:59:58.9Z       1435708798     900000
               | 2015-06-30T23:59:59.0Z       1435708799          0
               | 2015-06-30T23:59:59.1Z       1435708799     100000
               | ...                          ...            ...
               | 2015-06-30T23:59:59.9Z       1435708799     900000
               | 2015-06-30T23:59:60.0Z       1435708799          0
               | 2015-06-30T23:59:60.1Z       1435708799     100000
               | ...                          ...            ...
               | 2015-06-30T23:59:60.9Z       1435708799     900000
               | 2015-07-01T00:00:00.0Z       1435708800          0
               | 2015-07-01T00:00:00.1Z       1435708800     100000
               | ...                          ...            ...
               | 
               | I don't know about you but I wouldn't call that "counting
               | seconds as units of time as they pass"
        
               | paulddraper wrote:
               | I suppose you can envision UNIX clock with
               | "inertia"/Newton's second law.
               | 
               | But when I count people passing me, I press my clicker
               | each time to count them. Except once in a while I don't
               | do anything, i.e. ignore them.
        
               | catiopatio wrote:
               | Not doing anything is a decision to depart from your
               | standard behavior. It's not "ignoring".
        
           | zokier wrote:
           | Yes, that means that :60 should parse into same value as :59.
           | My reading of the code is that now it might end up in the
           | `return false` branch instead, although I'm not exactly sure
           | how the limits thing is used here
        
           | MobiusHorizons wrote:
           | Sure, but he's not parsing Unix timestamps. He's parsing time
           | in the format %Y%m%d%H%M%S, which would have leap seconds on
           | most systems
        
             | paulddraper wrote:
             | He's parsing them into UNIX timestamps.
             | 
             | And apparently these strings don't have second 60.
             | 
             | Works for me.
        
         | worewood wrote:
         | Reminds me of that post about things programmers always get
         | wrong about dates [1].
         | 
         | [1]: https://news.ycombinator.com/item?id=4128208
        
       | 0xr0kk3r wrote:
       | I love how bad the comments are for this code. You'd think a
       | professor would want to encourage better documentation, or at
       | least expose students who are reading this post to some basic
       | commenting. Especially because this is a really great code
       | example of how to twiddle packed values of different sizes and
       | ranges simultaneously, and not the standard SIMD examples of
       | matrix math that students usually see first.
        
         | dundarious wrote:
         | Looks to be using a lot of similar ideas to the ones explained
         | well in these articles:
         | 
         | https://kholdstare.github.io/technical/2020/05/26/faster-int...
         | 
         | http://0x80.pl/articles/simd-parsing-int-sequences.html
        
         | jakear wrote:
         | What are you even talking about? There's tons of comments and
         | the code isn't obscure. https://github.dev/lemire/Code-used-on-
         | Daniel-Lemire-s-blog/...
        
           | 0xr0kk3r wrote:
           | What do you think I'm talking about? OP pruned the comments
           | out of github and put the code on the page w/o them. I didn't
           | click to the github page. Still confused?
        
         | DonPellegrino wrote:
         | It's not bad at all https://github.com/lemire/Code-used-on-
         | Daniel-Lemire-s-blog/...
        
           | 0xr0kk3r wrote:
           | Oh, I didn't follow that link, I was just reading the blog.
           | Those are decent comments, too bad they were nixed for the
           | blog.
        
             | morelisp wrote:
             | But the entire paragraph where they are introduced is a
             | commentary on them.
        
               | 0xr0kk3r wrote:
               | Yes, I read that. It is in-prose form and detached from
               | the comments. Can't believe I'm arguing about this, but
               | comments need to be next to the code (like the github).
               | Thanks for mansplaining tho.
        
         | brigadier132 wrote:
         | Code from academia is notoriously unreadable
        
         | WJW wrote:
         | > You'd think a professor would want to encourage better
         | documentation, or at least expose students who are reading this
         | post to some basic commenting.
         | 
         | Welcome to academic code. We're lucky there's any comments at
         | all. :)
        
       | dmw_ng wrote:
       | Neatly omits the massively expensive calendar conversion step for
       | turning the timestamp into a useful format for fast arithmetic.
       | Parsing integers from strings is not why parsing timestamps is
       | slow, any more than counting the length of strings is why XML
       | parsing might be slow.
       | 
       | For time-sequential data, one easy speedup hack is caching the
       | UNIX epoch for the current day start and adding the HH:MM:SS
       | manually. No measurements to offer, but would suggest approaches
       | like this would cleanly beat simdification of a trivial part of a
       | complex task
        
         | morelisp wrote:
         | > omits the massively expensive calendar conversion step
         | 
         | Is that not included in what's 6x faster? I agree comparing
         | against strptime is a bit of a gimme and a scalar fixed-format
         | parser would be fairer, but I'm pretty sure it will beat that
         | too.
         | 
         |  _Edit to reply since rate-limited:_
         | 
         | > No mention of calendar in either place. Did you see something
         | in the article or code, or are you just asking us to check?
         | 
         | The (lack so far of) calendar is mentioned in the blog post -
         | "It does not get all the parsing done... We can just use
         | standard C code for the result." - and is _very obviously_ in
         | the GitHub code.
         | 
         | https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/...
        
           | refulgentis wrote:
           | After reading twice and scouring the code on GitHub, I don't
           | think so.
           | 
           | No mention of calendar in either place.
           | 
           | Did you see something in the article or code, or are you just
           | asking us to check?
        
         | anarazel wrote:
         | FWIW, the last time I measured the string handling part was far
         | slower than the timezone conversion when loading timestamptz
         | data in postgres.
        
         | wood_spirit wrote:
         | Indeed! It was a while ago but Im quite proud of the fast
         | timestamp functions I made for Java after the standard library
         | timestamp performance turned out to be critical to some data
         | pipelines I was working on.
         | 
         | The early versions of the code did cache the last parsed date
         | etc but as the parser got quicker this became diminishing
         | returns.
         | 
         | The fastest way I found for the calendar month numbers etc was
         | to use a precomputed table etc.
         | 
         | https://github.com/williame/TimeMillis
        
           | lichtenberger wrote:
           | Wouldn't it be possible to contribute the faster algorithm to
           | the JDK (e.g. the Instant class)? Maybe you'd need some
           | iterations, but do you see any downsides?
        
         | dataflow wrote:
         | > one easy speedup hack is caching the UNIX epoch for the
         | current day start and adding the HH:MM:SS manually
         | 
         | Side note, this is a really great example of why you'd want an
         | operation such as parsing to be state-mutating when you'd
         | normally expect it not to. i.e. why it makes sense to support
         | passing a Parser object rather than just a parse() function,
         | even for operations where you don't think it's necessary at
         | first glance.
        
           | DrJokepu wrote:
           | the downside is that now you will have to consider thread
           | safety
        
             | dataflow wrote:
             | Yep, you'd want to copy the parser per thread.
        
         | haberman wrote:
         | Do you mean the conversion from YYYY-MM-DD to Unix Time? That
         | can be quite fast; I wrote a blog article showing some
         | algorithms that take less than 10ns to perform the conversion:
         | https://blog.reverberate.org/2020/05/12/optimizing-date-algo...
         | 
         | Recently I came across another variant of the algorithm that is
         | even smaller and simpler. I've been meaning to test this
         | variant and add it to my article (assuming it is correct), but
         | I haven't had the chance:
         | https://dotat.at/@/2008-09-10-counting-the-days.html. (EDIT: I
         | just tried it out. It works, and while the source code is
         | smaller than my other variants, epoch_days_fast() from my
         | article is still smaller and faster in terms of machine code:
         | https://godbolt.org/z/Kjnafzqcx vs
         | https://godbolt.org/z/e7nchjKGW)
        
           | dmw_ng wrote:
           | Amazing work :) Do you know if glibc gmtime_r() might be
           | impacted by the active timezone? My "massively expensive"
           | experience above came from gmtime_r() on Amazon Linux running
           | in Lambda, but I can't remember the exact scenario involved
           | (or e.g. if TZ were set), it certainly seems unlikely worth
           | putting the effort into it that I did if the costs were
           | similar to those observed in your post.
        
             | haberman wrote:
             | Thanks! I didn't experiment with gmtime_r() since it goes
             | in the other direction (UnixTime -> civil time). I wouldn't
             | expect timegm() or gmtime_r() to depend on current
             | timezone, since both are hard-coded to use UTC. I never did
             | figure out why the macOS version is performing mutex locks
             | on global data.
        
       | kolbe wrote:
       | Interesting. I think you'd find that doing this in chunks of
       | 128-bit wide vectors with platform-agnostic logic would be even
       | better. Have you tried something like Google Highway or writing
       | this exact logic in scalar code and seeing if the compiler will
       | autovectorize it? Then you have the option to do up to 4 of them
       | at once instead of one if your platform supports it.
        
       | cratermoon wrote:
       | At one consulting gig I had at an airline, I noted that
       | timestamps were everywhere and among the most frequent operations
       | in the system was comparison with current time. The legacy
       | systems spit out date/time in at least 3 different formats, and
       | almost never included the timezone. The TZ had to be determined
       | by looking up the airport's TZ.
       | 
       | When I got there time comparisons were a mess, parsing in
       | multiple places, some comparisons not taking into account the TZ,
       | some add and subtract operations incorrectly code, and other
       | issues.
       | 
       | I tried to make the case that parsing and compare timestamps was
       | a core operation, and the system should have all that
       | functionality polished and packaged up to its own module so
       | programmers could focus on the rest of the system, but was
       | unsuccessful. We were writing in Go, so the language libraries
       | did about 80% of the work, and the less experienced programmers
       | thought it was sufficient.
       | 
       | When I left they were struggling with issues around determining
       | passenger age at the time of flight and getting off-by-one errors
       | around certain cutoff ages.
       | 
       | Time is Hard.
       | https://gist.github.com/timvisee/fcda9bbdff88d45cc9061606b4b...
        
       ___________________________________________________________________
       (page generated 2023-07-02 23:00 UTC)