https://lemire.me/blog/2021/06/03/computing-the-number-of-digits-of-an-integer-even-faster/

Skip to content

Daniel Lemire's blog

Daniel Lemire is a computer science professor at the University of
Quebec (TELUQ) in Montreal. His research is focused on software
performance and data engineering. He is a techno-optimist.

Menu and widgets

  * My home page
  * My papers
  * My software

Subscribe

Email Address [                    ]
[                    ]
[Subscribe by email]
Where to find me?

I am on Twitter and GitHub:

Follow @lemire
You can also find Daniel Lemire on

  * on Google Scholar with 4k citations and over 75 peer-reviewed
    publications,
  * on Facebook,
  * and on LinkedIn.

Before the pandemic of 2020, you could meet Daniel in person, as he
was organizing regular talks open to the public in Montreal: tribalab
and technolab .
Search for: [                    ] [Search]
Support my work!

I do not accept any advertisement. However, you can support the blog
with donations through paypal. Please consider getting in touch if
you are a supporter so that I can thank you.

Recent Posts

  * Computing the number of digits of an integer even faster
  * Computing the number of digits of an integer quickly
  * All models are wrong
  * Science and Technology links (May 22nd 2021)
  * Counting the number of matching characters in two ASCII strings

Recent Comments

  * jk-jeon on Computing the number of digits of an integer even
    faster
  * Daniel Lemire on Computing the number of digits of an integer
    even faster
  * jk-jeon on Computing the number of digits of an integer even
    faster
  * Daniel Lemire on Converting floating-point numbers to integers
    while preserving order
  * George Spelvin on Computing the number of digits of an integer
    even faster

Pages

  * A short history of technology
  * About me
  * Book recommendations
  * Cognitive biases
  * Interviews and talks
  * My bets
  * My favorite articles
  * My favorite quotes
  * My readers
  * My sayings
  * Predictions
  * Recommended video games
  * Terms of use
  * Write good papers

Archives

Archives [Select Month        ]

Boring stuff

  * Log in
  * Entries feed
  * Comments feed
  * WordPress.org

Computing the number of digits of an integer even faster

I my previous blog post, I documented how one might proceed to
compute the number of digits of an integer quickly. E.g., given the
integer 999, you want 3 but given the integer 1000, you want 4. It is
effectively the integer logarithm in base 10.

On computers, you can quickly compute the integer logarithm in base
2, and it follows that you can move from one to the other rather
quickly. You just need a correction which you can implement with a
table. A very good solution found in references such as Hacker's
Delight is as follows:

    static uint32_t table[] = {9, 99, 999, 9999, 99999,
    999999, 9999999, 99999999, 999999999};
    int y = (9 * int_log2(x)) >> 5;
    y += x > table[y];
    return y + 1;

Except for the computation of the integer logarithm, it involves a
multiplication by 9, a shift, a conditional move, a table lookup and
an increment. Can you do even better? You might! Kendall Willets
found an even more economical solution.

int fast_digit_count(uint32_t x) {
  static uint64_t table[] = {
      4294967296,  8589934582,  8589934582,  8589934582,  12884901788,
      12884901788, 12884901788, 17179868184, 17179868184, 17179868184,
      21474826480, 21474826480, 21474826480, 21474826480, 25769703776,
      25769703776, 25769703776, 30063771072, 30063771072, 30063771072,
      34349738368, 34349738368, 34349738368, 34349738368, 38554705664,
      38554705664, 38554705664, 41949672960, 41949672960, 41949672960,
      42949672960, 42949672960};
  return (x + table[int_log2(x)]) >> 32;
}

If I omit the computation of the integer logarithm in base 2, it
requires just a table lookup, an addition and a shift:

add     rax, qword ptr [8*rcx + table]
shr     rax, 32

The table contains the numbers ceil(log10(2^j)) * 2^32 + 2^32 - 10^
ceil(log10(2^j)) for j from 2 to 30, and then just ceil(log10(2^j))
for j = 31 and j = 32. The first value is 2^32 .

My implementation of Kendall's solution is available.

Using modern C++, you can compute the table using constant
expressions.

Further reading: Josh Bleecher Snyder has a blog post on this topic
which tells the whole story.

Published by

[4b7361]

Daniel Lemire

A computer science professor at the University of Quebec (TELUQ).
View all posts by Daniel Lemire

Posted on June 3, 2021June 4, 2021Author Daniel LemireCategories  

8 thoughts on "Computing the number of digits of an integer even
faster"

 1. [331059] KWillets says:
    June 3, 2021 at 9:02 pm

    Let me give Josh Bleecher Snyder credit for the idea that
    int_log2 splits the input into ranges where log10 can only go up
    by 1 at most. Within each such interval the problem is reduced to
    comparing against a table-derived threshold and incrementing if
    greater.

    Reply
 2. [134b8d] Josh Bleecher Snyder says:
    June 4, 2021 at 1:37 am

    Very nice! Three cheers for collaboration. 

    Reply
 3. [f24a34] Thomas Muller Graf says:
    June 4, 2021 at 7:05 am

    I wonder if this very nice trick could be used for other
    problems, e.g. fast square root, specially the integer square
    root.

    Reply
 4. [b02b01] Tushar Bali says:
    June 4, 2021 at 11:02 am

    exciting stuff

    Reply
 5. [293aad] George Spelvin says:
    June 4, 2021 at 11:49 am

    I still prefer the first solution because it extends to 64 bits
    naturally. The second solution can be considered a variant of the
    first where:
    * The high half of the 9*bits/32 approximation is replaced by a
    table lookup,
    * The threshold table lookup is indexed by number of bits, not
    number of digits (making it larger by a factor of log2(10) =
    3.32), and
    * The threshold comparison is done in the (semantically
    equivalent) form of an add with carry to the low half of the
    table.

    Bloating the table by a factor or 6.64 (for 32 bits, from 114 =
    44 bytes to 328 = 256 bytes, or +3 cache lines) doesn't seem like
    a good use of L1 cache, especially as any digit-counting function
    is going to be part of some much larger output formatting code
    and not an inner loop by itself.

    There's one more trick that hasn't shown up yet and might be
    useful. Given a safe underestimate such as
    y = 9*int_log2(x) / 32;
    then x >> y can be used to compute the number of digits without
    losing accuracy; the low digit_count(x) bits of x don't affect
    the result (because 10 is a multiple of 2).

    This might enable people to fit the necessary low and high parts
    of each table entry into a single word.

    Reply
     1. [407f36] jk-jeon says:
        June 4, 2021 at 9:53 pm

        Okay, I actually tried George Spelvin's idea and got this:
        https://godbolt.org/z/3h971c8K8.
        So difference between two implementations are that for the
        "fused" table lookup, we have

        shrx eax, edi, eax
        add eax, DWORD PTR table[0+rdx*4]
        shr eax, 26

        and for the normal method explained in Prof. Lemire's
        previous blog post, we have

        cmp DWORD PTR table[0+rdx*4], edi
        adc eax, 1

        I believe the first one might be very slightly faster as adc
        instruction is usually a bit slower than add. (But really,
        there should be no real difference.) I tested through
        quick-bench.com and it seems that GCC slightly favors the
        first one and Clang slightly favors the second one, but I
        think this is unfair, since quick-bench.com does not let me
        to specify the architecture flag, thus it generates shr eax,
        cl instead of shrx eax, edi, eax, which I believe is usually
        a bit slower.

        Reply
 6. [407f36] jk-jeon says:
    June 4, 2021 at 5:57 pm

    FYI, here is an application of the idea presented in this post
    into the 64-bit case: https://godbolt.org/z/e76eEKKj3

    Reply
     1. [4b7361] Daniel Lemire says:
        June 4, 2021 at 6:07 pm

        #include <cassert>
        #include <cstdint>
        #include <cstddef>

        constexpr int floor_log10_pow2(int e) noexcept {
            return (e * 1262611) >> 22;
        }

        constexpr int ceil_log10_pow2(int e) noexcept {
            return e == 0 ? 0 : floor_log10_pow2(e) + 1;
        }

        struct digit_count_table_holder_t {
          std::uint64_t entry[64];
        };

        constexpr digit_count_table_holder_t generate_digit_count_table() {
            digit_count_table_holder_t table{ {} };
            constexpr std::uint64_t pow10[] = {
                1ull,
                10ull,
                100ull,
                1000ull,
                1'0000ull,
                10'0000ull,
                100'0000ull,
                1000'0000ull,
                1'0000'0000ull,
                10'0000'0000ull,
                100'0000'0000ull,
                1000'0000'0000ull,
                1'0000'0000'0000ull,
                10'0000'0000'0000ull,
                100'0000'0000'0000ull,
                1000'0000'0000'0000ull,
                1'0000'0000'0000'0000ull,
                10'0000'0000'0000'0000ull,
                100'0000'0000'0000'0000ull,
                1000'0000'0000'0000'0000ull
            };

            for (int i = 0; i < 64; ++i) {
                auto const ub = std::uint64_t(ceil_log10_pow2(i));
                assert(ub <= 19);
                table.entry[i] = ((ub + 1) << 52) - (pow10[ub] >> (i / 4));
            }

            return table;
        }

        constexpr inline auto digit_count_table = generate_digit_count_table();

        int floor_log2(std::uint64_t n) noexcept {
            return 63 ^ __builtin_clzll(n);
        }

        int count_digit(std::uint64_t n) noexcept {
            return int((digit_count_table.entry[floor_log2(n)] + (n >> (floor_log2(n) / 4))) >> 52);
        }

        Reply

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked 
*

To create code blocks or other preformatted text, indent by four
spaces:

    This will be displayed in a monospaced font. The first four
    spaces will be stripped off, but all other whitespace
    will be preserved.
    
    Markdown is turned off in code blocks:
     [This is not a link](http://example.com)

To create not a block, but an inline code span, use backticks:

Here is some inline `code`.

For more help see http://daringfireball.net/projects/markdown/syntax

        [                                             ]
        [                                             ]
        [                                             ]
        [                                             ]
        [                                             ]
        [                                             ]
        [                                             ]
Comment [                                             ]

Name * [                              ]

Email * [                              ]

Website [                              ]

[ ] Save my name, email, and website in this browser for the next
time I comment.

Receive Email Notifications? [no, do not subscribe      ] 
[instantly    ]
Or, you can subscribe without commenting.

[Post Comment] 

[                                             ]
[                                             ]
[                                             ]
[                                             ]
[                                             ]
[                                             ]
[                                             ]
[                                             ]
Post navigation

Previous Previous post: Computing the number of digits of an integer
quickly
Proudly powered by WordPress