https://lemire.me/blog/2023/07/01/parsing-time-stamps-faster-with-simd-instructions/ Skip to content Daniel Lemire's blog Daniel Lemire is a computer science professor at the Data Science Laboratory of the Universite du Quebec (TELUQ) in Montreal. His research is focused on software performance and data engineering. He is a techno-optimist and a free-speech advocate. Menu and widgets * My home page * My papers * My software Join over 12,500 email subscribers: [ ][Go!] You can follow this blog on telegram. You can find me on twitter as @lemire or on Mastodon. Search for: [ ] [Search] Support my work! I do not accept any advertisement. However, you can you can sponsor my open-source work on GitHub. Recent Posts * Parsing time stamps faster with SIMD instructions * Dynamic bit shuffle using AVX-512 * Science and Technology links (June 25 2023) * Citogenesis in science and the importance of real problems * Science and Technology links (June 11 2023) Recent Comments * aqrit on Parsing time stamps faster with SIMD instructions * Daniel on Parsing time stamps faster with SIMD instructions * Sergey Bronnikov on Parsing time stamps faster with SIMD instructions * Davidmh on Parsing time stamps faster with SIMD instructions * Arthur Chance on Parsing time stamps faster with SIMD instructions Pages * A short history of technology * About me * Book recommendations * Cognitive biases * Interviews and talks * My bets * My favorite articles * My favorite quotes * My readers * My rules * Newsletter * Predictions * Privacy Policy * Recommended video games * Terms of use * Write good papers Archives Archives [Select Month ] Boring stuff * Log in * Entries feed * Comments feed * WordPress.org Parsing time stamps faster with SIMD instructions In software, it is common to represent time as a time-stamp string. It is usually specified by a time format string. Some standards use the format %Y%m%d%H%M%S meaning that we print the year, the month, the day, the hours, the minutes and the seconds. The current time as I write this blog post would be 20230701205436 as a time stamp in this format. It is convenient because it is short, easy to read and if you sort the strings lexicographically, you also sort them chronologically. You can generate time stamps using any programming language. In C, the following program will print the current time (universal, not local time): #include #include int main() { char buffer[15]; struct tm timeinfo; time_t rawtime; time(&rawtime); gmtime_r(&rawtime, &timeinfo); size_t len = strftime(buffer, 15, "%Y%m%d%H%M%S", &timeinfo); buffer[14] = '\0'; puts(buffer); } We are interested in the problem of parsing these strings. In practice, this means that we want to convert them to an integer presenting the number of seconds since the Unix epoch. The Unix epoch is January 1st 1970. For my purposes, I will consider the time to be an unsigned 32-bit integer so we can represent time between 1970 and 2106. It is not difficult to switch over to a 64-bit integer or to signed integers. The way you typically solve this problem is to use something like the C function strptime. Can we do better? Modern processors have fast instructions that operate on several words at once, called SIMD instructions. We have a block of 14 characters. Let us assume that we can read 16 characters safely, ignoring the content of the leftover characters. We load the block of digits in a SIMD register. We subtract 0x30 (the code point value of the character '0'), and all bytes values should be between 0 and 9, inclusively. We know that some character must be smaller than 9, for example, we generally cannot have more than 59 seconds and never 60 seconds, in the time stamp string. So one character must be between 0 and 5. Similarly, we start the hours at 00 and end at 23, so one character must be between 0 and 2. We do a saturating subtraction of the maximum: the result of such a subtraction should be zero if the value is no larger. We then use a special instruction to multiply one byte by 10, and sum it up with the next byte, getting a 16-bit value. We then repeat the same approach as before, checking that the result is not too large. The code might look as follow using Intel intrinsic functions: __m128i v = _mm_loadu_si128((const __m128i *)date_string); v = _mm_sub_epi8(v, _mm_set1_epi8(0x30)); __m128i limit = _mm_setr_epi8(9, 9, 9, 9, 1, 9, 3, 9, 2, 9, 5, 9, 5, 9, -1, -1); __m128i abide_by_limits = _mm_subs_epu8(v, limit); // must be all zero const __m128i weights = _mm_setr_epi8( 10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 0, 0); v = _mm_maddubs_epi16(v, weights); __m128i limit16 = _mm_setr_epi16(99,99, 12, 31, 23, 59, 59, -1); __m128i abide_by_limits16 = _mm_subs_epu16(v, limit16); __m128i limits = _mm_or_si128(abide_by_limits16,abide_by_limits); if (!_mm_test_all_zeros(limits, limits)) { return false; } It does not get all the parsing done, but at this point, you have the months, days, hours, minutes and seconds as valid binary integer values. The year is parsed in two components (the first two digits, and the next two digits). We can just use standard C code for the result. Is it fast? I wrote a benchmark that I compile using GCC 12 on an Intel Ice Lake Linux server. instructions per stamp time per stamp standard C with strptime 700 46 SIMD approach 65 7.9 We use about 10 times fewer instructions, and we go 6 times faster. That is not bad, but I suspect it is not nearly optimal. The source code is available. Credit: Thanks to Jeroen Koekkoek from NLnetLabs for initial work and for proposing the problem, and to @aqrit for sketching the current code. Published by [2ca999] Daniel Lemire A computer science professor at the University of Quebec (TELUQ). View all posts by Daniel Lemire Posted on July 1, 2023July 2, 2023Author Daniel LemireCategories 7 thoughts on "Parsing time stamps faster with SIMD instructions" 1. [e03257] -.- says: July 2, 2023 at 1:29 am The SSE code looks good. Possible tweaks (probably won't show any difference in benchmarks, but may have theoretical advantages): replace _mm_sub_epi8 with _mm_xor_si128 - latter is commutative, so may give compiler more freedom with ordering (e.g. use load-op on the source if the 0x30 vector is already in a register). Some CPUs may have more ports for bitwise ops over arithmetic maddubs has longish latency; the _mm_subs_epu16 could actually be done before it if you use a BCD-like representation for limit16 which might help ILP Reply 2. [a133ac] Duck says: July 2, 2023 at 7:32 am I solved a very similar problem on Stackoverflow some time ago. Its speed was < 1ns per time stamp. https://stackoverflow.com/questions/75680256/ most-insanely-fast-way-to-convert-yymmdd-hhmmss-timestamp-to-uint64-t-number Reply 3. [07e182] Arthur Chance says: July 2, 2023 at 1:35 pm we cannot have more than 59 seconds and never 60 seconds A slight nitpick: technically the seconds field can be '60' at 23:59 on the 30th of June or 31st of December if there's a positive leap second. Reply 1. [2f4c56] aqrit says: July 2, 2023 at 9:01 pm According to my reading of the linked RFC 4034 draft: A value of 60+ seconds is explicitly forbidden. The field being serialized is a Unix Timestamp. Unix Time explicitly ignores leap seconds. Converting Unix Time to UTC will never yield a leap second, as it can not be represented. Reply 4. [18a30c] Davidmh says: July 2, 2023 at 1:57 pm As Arthur indicates, you are ignoring all the leap seconds, which explains at least some of the reasons why the library version is slower. Not enough to get to 10 times more instructions, but still not a fair comparison. Reply 5. [fc8788] Sergey Bronnikov says: July 2, 2023 at 4:30 pm SQL standard allows 62 seconds (0-61) in a minute, see https:// twitter.com/noop_noob/status/1166982640118845442 Reply 6. [0b6cf4] Daniel says: July 2, 2023 at 7:56 pm Just wanted to say hello and I'd wish I understood half of the code shared Reply Leave a Reply Cancel reply Your email address will not be published. To create code blocks or other preformatted text, indent by four spaces: This will be displayed in a monospaced font. The first four spaces will be stripped off, but all other whitespace will be preserved. Markdown is turned off in code blocks: [This is not a link](http://example.com) To create not a block, but an inline code span, use backticks: Here is some inline `code`. For more help see http://daringfireball.net/projects/markdown/syntax [ ] [ ] [ ] [ ] [ ] [ ] [ ] Comment * [ ] Name * [ ] Email * [ ] Website [ ] [ ] Save my name, email, and website in this browser for the next time I comment. Receive Email Notifications? [no, do not subscribe ] [instantly ] Or, you can subscribe without commenting. [Post Comment] [ ] [ ] [ ] [ ] [ ] [ ] [ ] D[ ] You may subscribe to this blog by email. Post navigation Previous Previous post: Dynamic bit shuffle using AVX-512 Terms of use Proudly powered by WordPress