https://lemire.me/blog/2022/11/25/making-all-your-integers-positive-with-zigzag-encoding/

Skip to content

Daniel Lemire's blog

Daniel Lemire is a computer science professor at the University of
Quebec (TELUQ) in Montreal. His research is focused on software
performance and data engineering. He is a techno-optimist and a
free-speech advocate.

Menu and widgets

  * My home page
  * My papers
  * My software

Subscribe

Join 12,500 subscribers:
Email Address [                    ]
[                    ]
[Subscribe by email]
You can also follow this blog on telegram.
Search for: [                    ] [Search]
Support my work!

I do not accept any advertisement. However, you can you can sponsor
my open-source work on GitHub.

Recent Posts

  * Making all your integers positive with zigzag encoding
  * What is the size of a byte[] array in Java?
  * Rounding modes: std::from_chars versus strtod/strtof
  * A fast function to check your floating-point rounding mode
  * Measuring the memory usage of your C++ program

Recent Comments

  * harold on Making all your integers positive with zigzag encoding
  * Daniel Lemire on Making all your integers positive with zigzag
    encoding
  * Frederico L. Pissarra on Making all your integers positive with
    zigzag encoding
  * dirtysalt on Data structure size and cache-line accesses
  * Daniel Lemire on Quickly identifying a sequence of digits in a
    string of characters

Pages

  * A short history of technology
  * About me
  * Book recommendations
  * Cognitive biases
  * Interviews and talks
  * My bets
  * My favorite articles
  * My favorite quotes
  * My readers
  * My sayings
  * Predictions
  * Recommended video games
  * Terms of use
  * Write good papers

Archives

Archives [Select Month        ]

Boring stuff

  * Log in
  * Entries feed
  * Comments feed
  * WordPress.org

Making all your integers positive with zigzag encoding

You sometimes feel the need to make all of your integers positive,
without losing any information. That is, you want to map all of your
integers from 'signed' integers (e.g., -1, 1, 3, -3) to 'unsigned
integers' (e.g., 3,2,6,7). This could be useful if you have a fast
function to compress integers that fails to work well for negative
integers.

Many programming languages (Go, C, etc.) allow you just 'cast' the
integer. For example, the following Go code will print out -1,
18446744073709551615, -1 under most systems:

        var x = -1
        var y = uint(x)
        var z = int(y)
        fmt.Println(x, y, z)

That is, you can take a small negative value, interpret it as a large
integer, and then 'recover' back your small value.

What if you want to have that small values remain small ? Then  a
standard approach is to use zigzag encoding. The recipe is as
follows:

  * Compute twice the absolute value of your integer.
  * Subtract 1 to the result when the original integer was negative.

Effectively, what you are doing is that all positive integers become
even integers (exactly twice as big), and all negative integers
become odd integers. We interleave negative and positive integers
(odd and even).

original value zigzag value
-20            39
-19            37
-18            35
-17            33
-16            31
-15            29
-14            27
-13            25
-12            23
-11            21
-10            19
-9             17
-8             15
-7             13
-6             11
-5             9
-4             7
-3             5
-2             3
-1             1
0              0
1              2
2              4
3              6
4              8
5              10
6              12
7              14
8              16
9              18
10             20
11             22
12             24
13             26
14             28
15             30
16             32
17             34
18             36
19             38
20             40

In Python, you might implement the encoding and the decoding as
follows:

def zigzag_encode(val) :
    if val < 0:
        return - 2 * val  - 1
    return 2 * val

def zigzag_decode(val) :
    if val & 1 == 1 :
        return - val // 2
    return val // 2

The same code in C/C++ might work, but it could be more efficient to
use optimized code which assumes that the underlying hardware
represents signed integers with two's complement encoding (which is a
safe assumption in 2022 and a requirement in C++20) and that bytes
span 8 bits (another safe assumption)...

int fast_decode(unsigned int x) {
    return (x >> 1) ^ (-(x&1));
}

unsigned int fast_encode(int x) {
    return (2*x) ^ (x >>(sizeof(int) * 8 - 1));
}

Much the same code will work in Go, Rust, Swift, etc.

Published by

[2ca999]

Daniel Lemire

A computer science professor at the University of Quebec (TELUQ).
View all posts by Daniel Lemire

Posted on November 25, 2022November 25, 2022Author Daniel Lemire
Categories  

3 thoughts on "Making all your integers positive with zigzag
encoding"

 1. [75daf5] Frederico L. Pissarra says:
    November 25, 2022 at 5:36 pm

    Beware that, in C, right shift with signed integers is a undef.
    behavior.

    Reply
     1. [2ca999] Daniel Lemire says:
        November 25, 2022 at 5:47 pm

        Up until C23 it was implementation defined. I expect that C23
        will fix this particular issue.

        In Go or Java, there is no issue.

        Reply
     2. [182d0c] harold says:
        November 25, 2022 at 6:22 pm

        Right shift of signed integers was not UB, the spec text was
        "If E1 has a signed type and a negative value, the resulting
        value is implementation-defined"

        Reply

Leave a Reply Cancel reply

Your email address will not be published. The comment form expects
plain text. If you need to format your text, you can use HTML
elements such strong, blockquote, cite, code and em. For formatting
code as HTML automatically, I recommend tohtml.com.

          [                                             ]
          [                                             ]
          [                                             ]
          [                                             ]
          [                                             ]
          [                                             ]
          [                                             ]
Comment * [                                             ]

Name * [                              ]

Email * [                              ]

Website [                              ]

[ ] Save my name, email, and website in this browser for the next
time I comment.

Receive Email Notifications? [no, do not subscribe      ] 
[instantly    ]
Or, you can subscribe without commenting.

[Post Comment] 

 [                                             ] 
 [                                             ] 
 [                                             ] 
 [                                             ] 
 [                                             ] 
 [                                             ] 
 [                                             ] 
D[                                             ] 

You may subscribe to this blog by email.

Post navigation

Previous Previous post: What is the size of a byte[] array in Java?
Terms of use Proudly powered by WordPress