https://lemire.me/blog/2022/11/25/making-all-your-integers-positive-with-zigzag-encoding/ Skip to content Daniel Lemire's blog Daniel Lemire is a computer science professor at the University of Quebec (TELUQ) in Montreal. His research is focused on software performance and data engineering. He is a techno-optimist and a free-speech advocate. Menu and widgets * My home page * My papers * My software Subscribe Join 12,500 subscribers: Email Address [ ] [ ] [Subscribe by email] You can also follow this blog on telegram. Search for: [ ] [Search] Support my work! I do not accept any advertisement. However, you can you can sponsor my open-source work on GitHub. Recent Posts * Making all your integers positive with zigzag encoding * What is the size of a byte[] array in Java? * Rounding modes: std::from_chars versus strtod/strtof * A fast function to check your floating-point rounding mode * Measuring the memory usage of your C++ program Recent Comments * harold on Making all your integers positive with zigzag encoding * Daniel Lemire on Making all your integers positive with zigzag encoding * Frederico L. Pissarra on Making all your integers positive with zigzag encoding * dirtysalt on Data structure size and cache-line accesses * Daniel Lemire on Quickly identifying a sequence of digits in a string of characters Pages * A short history of technology * About me * Book recommendations * Cognitive biases * Interviews and talks * My bets * My favorite articles * My favorite quotes * My readers * My sayings * Predictions * Recommended video games * Terms of use * Write good papers Archives Archives [Select Month ] Boring stuff * Log in * Entries feed * Comments feed * WordPress.org Making all your integers positive with zigzag encoding You sometimes feel the need to make all of your integers positive, without losing any information. That is, you want to map all of your integers from 'signed' integers (e.g., -1, 1, 3, -3) to 'unsigned integers' (e.g., 3,2,6,7). This could be useful if you have a fast function to compress integers that fails to work well for negative integers. Many programming languages (Go, C, etc.) allow you just 'cast' the integer. For example, the following Go code will print out -1, 18446744073709551615, -1 under most systems: var x = -1 var y = uint(x) var z = int(y) fmt.Println(x, y, z) That is, you can take a small negative value, interpret it as a large integer, and then 'recover' back your small value. What if you want to have that small values remain small ? Then a standard approach is to use zigzag encoding. The recipe is as follows: * Compute twice the absolute value of your integer. * Subtract 1 to the result when the original integer was negative. Effectively, what you are doing is that all positive integers become even integers (exactly twice as big), and all negative integers become odd integers. We interleave negative and positive integers (odd and even). original value zigzag value -20 39 -19 37 -18 35 -17 33 -16 31 -15 29 -14 27 -13 25 -12 23 -11 21 -10 19 -9 17 -8 15 -7 13 -6 11 -5 9 -4 7 -3 5 -2 3 -1 1 0 0 1 2 2 4 3 6 4 8 5 10 6 12 7 14 8 16 9 18 10 20 11 22 12 24 13 26 14 28 15 30 16 32 17 34 18 36 19 38 20 40 In Python, you might implement the encoding and the decoding as follows: def zigzag_encode(val) : if val < 0: return - 2 * val - 1 return 2 * val def zigzag_decode(val) : if val & 1 == 1 : return - val // 2 return val // 2 The same code in C/C++ might work, but it could be more efficient to use optimized code which assumes that the underlying hardware represents signed integers with two's complement encoding (which is a safe assumption in 2022 and a requirement in C++20) and that bytes span 8 bits (another safe assumption)... int fast_decode(unsigned int x) { return (x >> 1) ^ (-(x&1)); } unsigned int fast_encode(int x) { return (2*x) ^ (x >>(sizeof(int) * 8 - 1)); } Much the same code will work in Go, Rust, Swift, etc. Published by [2ca999] Daniel Lemire A computer science professor at the University of Quebec (TELUQ). View all posts by Daniel Lemire Posted on November 25, 2022November 25, 2022Author Daniel Lemire Categories 3 thoughts on "Making all your integers positive with zigzag encoding" 1. [75daf5] Frederico L. Pissarra says: November 25, 2022 at 5:36 pm Beware that, in C, right shift with signed integers is a undef. behavior. Reply 1. [2ca999] Daniel Lemire says: November 25, 2022 at 5:47 pm Up until C23 it was implementation defined. I expect that C23 will fix this particular issue. In Go or Java, there is no issue. Reply 2. [182d0c] harold says: November 25, 2022 at 6:22 pm Right shift of signed integers was not UB, the spec text was "If E1 has a signed type and a negative value, the resulting value is implementation-defined" Reply Leave a Reply Cancel reply Your email address will not be published. The comment form expects plain text. If you need to format your text, you can use HTML elements such strong, blockquote, cite, code and em. For formatting code as HTML automatically, I recommend tohtml.com. [ ] [ ] [ ] [ ] [ ] [ ] [ ] Comment * [ ] Name * [ ] Email * [ ] Website [ ] [ ] Save my name, email, and website in this browser for the next time I comment. Receive Email Notifications? [no, do not subscribe ] [instantly ] Or, you can subscribe without commenting. [Post Comment] [ ] [ ] [ ] [ ] [ ] [ ] [ ] D[ ] You may subscribe to this blog by email. Post navigation Previous Previous post: What is the size of a byte[] array in Java? Terms of use Proudly powered by WordPress