How I Made PDF Table Rendering 95% Faster in an Afternoon

Here at Apryse, we occasionally have some free time at the end of our iText development sprints where we're encouraged to use our initiative to "work on whatever" we fancy. During one of these periods, I developed a fascination with the details of how table rendering works in iText Core; specifically, why large cell counts seemed to slow it down so much. So, I decided to spend some time improving my understanding in that area, and to see if I could at least find what was causing it.

TL;DR: I optimized table rendering in iText with minimal code changes, by avoiding repeated border collapse calculations and unnecessary tagging overhead. This significantly improved rendering performance, with a 50k cell table going from 5 minutes to just 7 seconds. Here's how.

A Quick Overview of Tables in iText

Tables are one of the most useful and common layout tools for documents, but they are also quite difficult to implement in PDF because the specification doesn't really provide for tables; you only have very basic drawing instructions. Without going too much into the nitty-gritty of PDF syntax, you have instructions to:

* Move to an x,y coordinate on the page
* Draw a line from x1,y1 to x2,y2 in a specific color and line thickness
* Display the text "Hello world" at coordinates x1,y1

With a little imagination you can see how these simple instructions can be combined into a complex graphic that visually represents a table. Thankfully, iText Core's layout engine constructs high-level abstractions around these operations, so you don't have the pain of dealing directly with the low-level PDF syntax. This means you are probably more familiar with something like the following representation:

JAVA
Document document = new Document(pdf);
Table table = new Table(2);
table.addCell("Cell1");
table.addCell("Cell2");
document.add(table);
document.close();

Which results in something looking like this:

[basic_table.png]
What our simple table looks like when rendered

Conversely, using the previously mentioned low-level operations, this table would look like the following in actual PDF syntax:

PDF syntax cheat sheet: https://pdfa.org/wp-content/uploads/2023/08/PDF-Operators-CheatSheet.pdf

TEXT
q               % Start subsection (drawing the text Cell 1)
BT              % Begin text
/F1 12 Tf       % Select the specified font
38.5 790.83 Td  % Move text cursor to x coordinate 38.5, y coordinate 790.83
(Cell 1)Tj      % Show text string
ET              % End text
Q               % End section
q               % |
BT              % |
/F1 12 Tf       % |
88.5 790.83 Td  % |--> Same as above (Cell 2)
(Cell 2)Tj      % |
ET              % |
Q               % |
q               % Start subsection (drawing of the left border)
0 0 0 RG        % Select color black
0.5 w           % Set line width to 0.5
36.25 806 m     % Move cursor to x coordinate 36.25 and y coordinate 806
36.25 783.02 l  % Create line to x coordinate 36.25 and y coordinate 783.02
S               % Stroke the line
Q               % End section
q               % |
0 0 0 RG        % |
0.5 w           % |
86.25 806 m     % |--> Draw right border of cell 1
86.25 783.02 l  % |
S               % |
Q               % |
% omitted, but this continues for the other borders

So, all this gives you a basic overview of what happens when you create a table using iText's layout engine. As you can probably guess, a lot of calculations need to happen to determine the exact coordinates of those instructions, but that's out of scope for this article.
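By the way, if you want to see these operators for a file you've generated yourself, you can read the decoded content stream back with iText's kernel API. Here's a minimal sketch; the class and the "table.pdf" file name are just my own illustration, not part of the sample above, and printing the bytes as Latin-1 simply keeps the operator text readable:

JAVA
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfReader;

import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class DumpContentStream {
    public static void main(String[] args) throws IOException {
        // Read the PDF back and print the decoded content stream of page 1,
        // i.e. the raw operators (q, BT, Td, Tj, m, l, S, ...) shown above.
        try (PdfDocument pdf = new PdfDocument(new PdfReader("table.pdf"))) {
            byte[] contentBytes = pdf.getFirstPage().getContentBytes();
            System.out.println(new String(contentBytes, StandardCharsets.ISO_8859_1));
        }
    }
}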
Although, if you want a blog post going into more detail on the inner workings of iText's layout engine, let me know! What I'll be doing in this article is walking through how I identified what was causing the table rendering slowdown, and the steps I took to resolve it. With that said, let's go back to January of this year when my investigation began.

Creating a Performance Baseline

As mentioned, I wanted to see where the most time was spent when generating tables. So, I created the most basic table use case, and decided to time the performance for generating 100, 1,000, 10,000, and finally 50,000 cells:

JAVA
public class App {
    public static void main(String[] args) throws IOException {
        // Do some warmup
        generatePdfWithTable("table_warmup.pdf", 1000);
        int[] amounts = {100, 1_000, 10_000, 50_000};
        for (int amount : amounts) {
            long start = System.currentTimeMillis();
            generatePdfWithTable("table_" + amount + ".pdf", amount);
            long end = System.currentTimeMillis();
            System.out.println("Time to generate " + amount + " cells: " + (end - start) + "ms");
        }
    }

    private static void generatePdfWithTable(String fileName, int amountOfCells) throws IOException {
        PdfDocument pdf = new PdfDocument(new PdfWriter(fileName));
        Document document = new Document(pdf);
        Table table = new Table(5);
        for (int i = 0; i < amountOfCells; i++) {
            table.addCell("Cell1");
        }
        document.add(table);
        document.close();
    }
}

Probably not the most optimal benchmarking code, but I thought it should still do the trick. Running this code produced these results:

| Number of Cells | Time to Generate |
|-----------------|------------------|
| 100             | 7ms              |
| 1,000           | 53ms             |
| 10,000          | 767ms            |
| 50,000          | 35,660ms         |

Plotting this data in a graph produced this:

[runtime_graph.png]
In this graph, "line go up mean world more badder"

This looks suspiciously like an algorithm with quadratic time complexity. I'm not going to get into the details of exactly what that means, so don't worry! But it should be pretty clear that the generation time is not increasing proportionally: going from 10,000 to 50,000 cells multiplies the cell count by 5, yet the runtime grows by a factor of roughly 46. A low number of cells is OK(ish), while a high number of cells takes far longer than you'd expect.

Profiling

So, let's take a closer look by using flame graphs to track down potential bottlenecks. Here's the comparison of running the above sample with 1,000 cells and 50,000 cells.

1,000 cells
[flamegraph_1000.png]
The results of rendering our table with 1,000 cells

50,000 cells
[flamegraph_50000.png]
And now, with 50,000 cells

Interesting. We can see that almost all of the time is being spent in just two methods:

* com.itextpdf.layout.renderer.TableBorderUtil#createAndFillBorderList
* com.itextpdf.layout.renderer.CollapsedTableBorders#getCollapsedList

And both of those methods originate from the same function call: com.itextpdf.layout.renderer.CollapsedTableBorders#getVerticalBorder. Notice the Collapsed in the name, indicating that this happens when we use collapsed borders. What you might not realize is that this is also the default behaviour, so let's take a look at what happens when we run the same code and avoid collapsing borders.
By default, iText uses BorderCollapsePropertyValue.COLLAPSE, so we have to slightly modify our code sample:

JAVA
table.setBorderCollapse(BorderCollapsePropertyValue.SEPARATE);

Results:

| Number of Cells | BorderCollapsePropertyValue.SEPARATE | BorderCollapsePropertyValue.COLLAPSE |
|-----------------|--------------------------------------|--------------------------------------|
| 100             | 9ms                                  | 7ms                                  |
| 1,000           | 59ms                                 | 53ms                                 |
| 10,000          | 558ms                                | 767ms                                |
| 50,000          | 1,601ms                              | 35,660ms                             |

Indeed, it seems using the collapsing border feature increases the runtime by a lot -- I might even call it a performance bug if Marketing lets me... Anyway, improving the collapsing border feature is now our top priority. Let's investigate!

Remember that the flame graph showed two functions responsible for almost all of the runtime, and both of them are the result of calling com.itextpdf.layout.renderer.CollapsedTableBorders#getVerticalBorder. So, let's have a look at this function:

JAVA
@Override
public List<Border> getVerticalBorder(int index) {
    if (index == 0) {
        List<Border> borderList = TableBorderUtil
                .createAndFillBorderList(null, tableBoundingBorders[3], verticalBorders.get(0).size());
        return getCollapsedList(verticalBorders.get(0), borderList);
    } else if (index == numberOfColumns) {
        List<Border> borderList = TableBorderUtil.createAndFillBorderList(null,
                tableBoundingBorders[1], verticalBorders.get(verticalBorders.size() - 1).size());
        return getCollapsedList(verticalBorders.get(verticalBorders.size() - 1), borderList);
    } else {
        return verticalBorders.get(index);
    }
}

Walking through this code, we can see that for the outermost borders we build a border list; this is done to determine which border color and width to use when collapsing. For an inner column, however, we can just take the border's width and color without needing to collapse anything.

For a given table, once verticalBorders and tableBoundingBorders have been initialized, their values will not change. This means it's pointless to recalculate the collapsed border list over and over again when processing each row, as the results will always be the same. Instead, we should cache those results once they are calculated. So, let's lazily calculate the results and store them in verticalBorderComputationResult for reuse:

JAVA
private final Map<Integer, List<Border>> verticalBorderComputationResult = new HashMap<>();

@Override
public List<Border> getVerticalBorder(int index) {
    // If not outermost, we don't need to calculate collapsed borders
    if (index != 0 && index != numberOfColumns) {
        return verticalBorders.get(index);
    }
    if (verticalBorderComputationResult.containsKey(index)) {
        return verticalBorderComputationResult.get(index);
    }
    final int tableBoundingBordersIndex = index == 0 ? 3 : 1;
    final Border boundingBorder = tableBoundingBorders[tableBoundingBordersIndex];
    final List<Border> verticalBorder = verticalBorders.get(index);
    final List<Border> borderList = TableBorderUtil
            .createAndFillBorderList(null, boundingBorder, verticalBorder.size());
    final List<Border> result = getCollapsedList(verticalBorder, borderList);
    verticalBorderComputationResult.put(index, result);
    return result;
}

Now, we simply calculate once and return the cached results whenever we need them. Seems promising, but let's verify the results of this small refactoring of the function.
Running our newly-optimized code gives us the following:

| Number of Cells | BorderCollapsePropertyValue.SEPARATE | BorderCollapsePropertyValue.COLLAPSE | COLLAPSE with fix |
|-----------------|--------------------------------------|--------------------------------------|-------------------|
| 100             | 9ms                                  | 7ms                                  | 7ms               |
| 1,000           | 59ms                                 | 53ms                                 | 49ms              |
| 10,000          | 558ms                                | 767ms                                | 485ms             |
| 50,000          | 1,601ms                              | 35,660ms                             | 1,310ms           |

As you can see, we managed to reduce the runtime from 35 seconds to 1.3 seconds, without adding a lot of complexity. Pretty good for an hour of work.

Tagging Tables

We're not done yet though. While I was working on general improvements to table performance, I also wanted to check how tagged tables compare to untagged tables.

Tagging is quite a complex subject, but the only thing you need to know for this blog is that it allows screen readers and other accessibility tools to better understand the semantic meaning of the contents of a PDF document: what is a heading, what is body text, etc. More importantly for this article, there are specific tags to identify content formatted as a table. You can find a lot more information in the Tagged PDF Q&A on the PDF Association's site.

If you are using iText Core's layout engine (or indeed the pdfHTML add-on), you can simply include this in your code to enable tagging:

JAVA
PdfDocument pdf = new PdfDocument(new PdfWriter(fileName));
pdf.setTagged();

This will ensure that your generated PDF file is tagged accordingly. I won't go into the exact details of how iText does this now, but let me know if you want more info or blogs on this topic.

So, let's return to our performance testing code. As you can see, the only addition we've made is to enable tagging.

JAVA
private static void generatePdfWithTable(String fileName, int amountOfCells) throws IOException {
    PdfDocument pdf = new PdfDocument(new PdfWriter(fileName));
    pdf.setTagged();
    Document document = new Document(pdf);
    Table table = new Table(5);
    for (int i = 0; i < amountOfCells; i++) {
        table.addCell("Cell1");
    }
    document.add(table);
    document.close();
}

When we run this example and compare against our now-improved baseline results, we can see some major issues occurring.

| Number of Cells | Untagged tables | Tagged tables |
|-----------------|-----------------|---------------|
| 100             | 7ms             | 16ms          |
| 1,000           | 49ms            | 206ms         |
| 10,000          | 485ms           | 15,637ms      |
| 50,000          | 1,310ms         | 300,018ms     |

Again, there's something fishy going on, so let's dive into it. Once more, we'll bring out the flame graphs to see where we could be losing all this time:

[flame_graph_tagged_is_flushed.png]
Using flame graphs again to dig deeper

After poking through the code a bit, I saw the following in the IntelliJ gutter:

[kids_hint_lot_of_time.png]
Flushing out the problem

Hold on, this single safety check takes 50% of the total runtime! This seems like an ideal case for improvement. For some context, flushing is the process where iText writes completed pages to disk to keep memory usage low. So the check is there to make sure that you don't try to modify parts of the tag tree structure that have already been finalized. Since we can't just remove this check, let's instead see what's happening in the getKids() method.

JAVA
@Override
public List<IStructureNode> getKids() {
    PdfObject k = getK();
    List<IStructureNode> kids = new ArrayList<>();
    if (k != null) {
        if (k.isArray()) {
            PdfArray a = (PdfArray) k;
            for (int i = 0; i < a.size(); i++) {
                addKidObjectToStructElemList(a.get(i), kids);
            }
        } else {
            addKidObjectToStructElemList(k, kids);
        }
    }
    return kids;
}

We see that we perform a rather expensive parsing operation to convert low-level PdfObjects to high-level structure elements. But we don't really do anything with the parsed information; we only want to know whether a particular kid has already been flushed.
Let's try to rewrite this so that it doesn't parse any of the PDF objects.

JAVA
public boolean isKidFlushed(int index) {
    PdfObject k = getK();
    if (k == null) {
        return false;
    }
    if (k.isArray()) {
        PdfArray array = (PdfArray) k;
        if (index >= array.size()) {
            return false;
        }
        return array.get(index).isFlushed();
    }
    return index == 0 && k.isFlushed();
}

What we've done here is completely eliminate the need for conversion, since we just use iText's low-level PdfObject API. Instead of looping over and parsing each element, it now becomes an array lookup plus a function invocation when the PdfObject is a PdfArray, and a single function invocation for any other PdfObject. For reference, you can see the full implementation on GitHub: https://github.com/itext/itext-java/commit/71451319ebb9463d2c577bdaa89e4958e4ca2dd8

Let's see how much we've improved the speed.

| Number of Cells | Tagged tables | Optimized kid lookup |
|-----------------|---------------|----------------------|
| 100             | 16ms          | 19ms                 |
| 1,000           | 206ms         | 203ms                |
| 10,000          | 15,637ms      | 2,761ms              |
| 50,000          | 300,018ms     | 111,790ms            |

Nice, so by avoiding unneeded computation we've already made it about three times faster. But we are still pretty far away from the performance we get without tagging, so let's continue the investigation.

Further Table Tweaks

After some further profiling, I found I could add a few more optimizations. While the results were nice, the optimizations themselves aren't super interesting. So, I'll just provide brief descriptions here along with links to my commits if you want more info.

Cache tagging hint key
We always had to call into an expensive function to get a value that didn't change, so simply caching the result once helped quite a bit.
Commit: b50c34e14f012993f81a02500542edaa54d3ea5c

Remove duplicate function invocation
An expensive calculation was executed twice in the same function, even though the result could not have changed in between. Storing the result in a variable and reusing it resulted in a drastic improvement.
Commit: b6b212971e285a4a800492f1d17651827b3de623

Add rows in bulk
The previous implementation added new tags one by one, which caused a lot of unneeded checks for each tag. To avoid those duplicate checks we now gather all the tags, and then add them at once.
Commit: 0626cd422a275ac402aaa3aa34d92d17f934174f

After implementing these three minor improvements, let's check the results:

| Number of Cells | Tagged tables | Optimized kid lookup | 3 minor fixes |
|-----------------|---------------|----------------------|---------------|
| 100             | 16ms          | 19ms                 | 20ms          |
| 1,000           | 206ms         | 203ms                | 117ms         |
| 10,000          | 15,637ms      | 2,761ms              | 1,485ms       |
| 50,000          | 300,018ms     | 111,790ms            | 18,508ms      |

As you can see, even though the code changes are small and localized, we've managed to make it a further 5 times faster!

Next Nearest Sibling Algorithm

The last commit of the patch set is quite interesting, and so it gets its own special section. It showcases how adding a small heuristic to an algorithm can drastically improve the runtime. Let's say we have the following table:

| Header Cell 1 | Header Cell 2 | Header Cell 3 | Header Cell 4 |
|---------------|---------------|---------------|---------------|
| Cell 1        | Cell 2        | Cell 3        | Cell 4        |
| Cell 5        | Cell 6        | Cell 7        | Cell 8        |

If we were to open the PDF and look at the tag structure, it would look something like this:

TEXT
-Document
--Table
---Thead
----th
-----span (Header Cell 1)
----th
-----span (Header Cell 2)
....
---Tbody
----td
-----span (Cell 1)
----td
-----span (Cell 2)

(You might notice that there are no tr elements for table rows. This is expected, as we only generate those dummy elements when we finalize the table. This is because different PDF specifications require us to have different structures.)
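If you want to dump a tag tree like this for a file of your own, you can walk the structure tree with iText's kernel tagging API. Below is a minimal sketch; the traversal, the labels, and the "table_tagged.pdf" file name are my own illustration, built on the standard getStructTreeRoot()/IStructureNode accessors:

JAVA
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfReader;
import com.itextpdf.kernel.pdf.tagging.IStructureNode;

import java.io.IOException;

public class DumpTagTree {
    public static void main(String[] args) throws IOException {
        try (PdfDocument pdf = new PdfDocument(new PdfReader("table_tagged.pdf"))) {
            print(pdf.getStructTreeRoot(), 1);
        }
    }

    // Recursively print each structure node's role, indented by its depth.
    private static void print(IStructureNode node, int depth) {
        String label = node.getRole() == null
                ? node.getClass().getSimpleName()   // some nodes (e.g. content references) carry no role
                : node.getRole().getValue();
        System.out.println("-".repeat(depth) + label);
        if (node.getKids() == null) {
            return;
        }
        for (IStructureNode kid : node.getKids()) {
            if (kid != null) {
                print(kid, depth + 1);
            }
        }
    }
}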
When adding a new cell to the table, we need to create the correct tag and find the correct place to insert it in the tag tree. Using the profiler, I saw that these were the offending lines:

JAVA
// Omitted for clarity
List<TaggingHintKey> parentKidsHint = getAccessibleKidsHint(parentKey);
int kidIndInParentKidsHint = parentKidsHint.indexOf(kidKey);
int ind = getNearestNextSiblingTagIndex(waitingTagsManager, parentPointer,
        parentKidsHint, kidIndInParentKidsHint);
// insert tag at index.... Omitted for clarity

The implementations of these functions can be seen below:

JAVA
public List<TaggingHintKey> getAccessibleKidsHint(TaggingHintKey parent) {
    List<TaggingHintKey> kidsHint = kidsHints.get(parent);
    if (kidsHint == null) {
        return Collections.emptyList();
    }
    List<TaggingHintKey> accessibleKids = new ArrayList<>();
    for (TaggingHintKey kid : kidsHint) {
        if (isNonAccessibleHint(kid)) {
            accessibleKids.addAll(getAccessibleKidsHint(kid));
        } else {
            accessibleKids.add(kid);
        }
    }
    return accessibleKids;
}

private int getNearestNextSiblingTagIndex(WaitingTagsManager waitingTagsManager, TagTreePointer parentPointer,
        List<TaggingHintKey> siblingsHint, int start) {
    int ind = -1;
    TagTreePointer nextSiblingPointer = new TagTreePointer(document);
    while (++start < siblingsHint.size()) {
        if (waitingTagsManager.tryMovePointerToWaitingTag(nextSiblingPointer, siblingsHint.get(start))
                && parentPointer.isPointingToSameTag(new TagTreePointer(nextSiblingPointer).moveToParent())) {
            ind = nextSiblingPointer.getIndexInParentKidsList();
            break;
        }
    }
    return ind;
}

So, the first line recursively flattens the tree to a list. This initial implementation is pretty decent, really: it's quick to implement, and in most cases the flattened sub-structure will only contain very few tags, so the operation isn't expensive enough to be worth optimizing. The current algorithm looks like this in pseudo-code:

1. Flatten the whole tree to a list
2. Search from the start of the list to find the desired TaggingHintKey
3. Starting from the found index, look for the next TaggingHintKey which has the same parent

The Trouble with Tagging

We know this works well in most scenarios. But in our table, the Tbody tag will have as many kids as the table has cells, and since this code is executed for every newly-added cell, it has to flatten the tree again and again. In the case of tables, we know we always add cells at the end of the list -- in fact, we do this in most general tagging scenarios. So, it does not make sense to start looking from the beginning of the list, since we know we will always be at the other end of it.

Determining Some Improvements

Now that we have a better understanding of the problem space, we can optimize our algorithm to take into account the heuristics we have:

1. Process the list recursively backwards until you find the desired TaggingHintKey to start from
2. When found, look ahead from there to find the next sibling

If we do this, we completely avoid the two pitfalls which currently slow down the table creation process: flattening the whole tree up front, and scanning it from the beginning. So, the implementation of this new algorithm would look something like this (notice that the call site no longer needs the flattened parentKidsHint list; it just passes the parent and kid hint keys):

JAVA
// Omitted for clarity
int ind = getNearestNextSiblingIndex(waitingTagsManager, parentPointer, parentKey, kidKey);
// insert tag at index.... Omitted for clarity
JAVA
private int getNearestNextSiblingIndex(WaitingTagsManager waitingTagsManager, TagTreePointer parentPointer,
        TaggingHintKey parentKey, TaggingHintKey kidKey) {
    ScanContext scanContext = new ScanContext();
    scanContext.waitingTagsManager = waitingTagsManager;
    scanContext.startHintKey = kidKey;
    scanContext.parentPointer = parentPointer;
    scanContext.nextSiblingPointer = new TagTreePointer(document);
    return scanForNearestNextSiblingIndex(scanContext, null, parentKey);
}

private int scanForNearestNextSiblingIndex(ScanContext scanContext, TaggingHintKey toCheck, TaggingHintKey parent) {
    if (scanContext.startVerifying) {
        if (scanContext.waitingTagsManager.tryMovePointerToWaitingTag(scanContext.nextSiblingPointer, toCheck)
                && scanContext.parentPointer.isPointingToSameTag(
                        new TagTreePointer(scanContext.nextSiblingPointer).moveToParent())) {
            return scanContext.nextSiblingPointer.getIndexInParentKidsList();
        }
    }
    if (toCheck != null && !isNonAccessibleHint(toCheck)) {
        return -1;
    }
    List<TaggingHintKey> kidsHintList = kidsHints.get(parent);
    if (kidsHintList == null) {
        return -1;
    }

    int startIndex = -1;
    if (!scanContext.startVerifying) {
        // Walk backwards: the key we just added is almost always at the end of the list
        for (int i = kidsHintList.size() - 1; i >= 0; i--) {
            if (scanContext.startHintKey == kidsHintList.get(i)) {
                scanContext.startVerifying = true;
                startIndex = i;
                break;
            }
        }
    }

    // From that point on, look ahead for the nearest sibling that points to the same parent tag
    for (int j = startIndex + 1; j < kidsHintList.size(); j++) {
        final TaggingHintKey kid = kidsHintList.get(j);
        final int intermediateResult = scanForNearestNextSiblingIndex(scanContext, kid, kid);
        if (intermediateResult != -1) {
            return intermediateResult;
        }
    }
    return -1;
}

private static class ScanContext {
    WaitingTagsManager waitingTagsManager;
    TaggingHintKey startHintKey;
    boolean startVerifying;
    TagTreePointer parentPointer;
    TagTreePointer nextSiblingPointer;
}

OK, so let's plug this snazzy new tagging algorithm into the code and see what we get:

| Number of Cells | Tagged tables | Optimized kid lookup | 3 minor fixes | Optimized search algo |
|-----------------|---------------|----------------------|---------------|-----------------------|
| 100             | 16ms          | 19ms                 | 20ms          | 19ms                  |
| 1,000           | 206ms         | 203ms                | 117ms         | 120ms                 |
| 10,000          | 15,637ms      | 2,761ms              | 1,485ms       | 1,072ms               |
| 50,000          | 300,018ms     | 111,790ms            | 18,508ms      | 7,525ms               |

Alright, that's tagging a 50k cell table almost 40x faster. Not bad for an afternoon's work, eh?

Conclusions

And so we come to the end of Guust's Adventures in Optimization Land! But in all seriousness, I wanted to highlight how these improvements were achieved, since it shows how profiling and a little curiosity can lead to huge performance wins with minimal risk. All the above table improvements were included in the iText Core 9.1.0 release in February. If you're generating PDFs at scale or using collapsed borders/tagging, I highly recommend checking it out, as you should get some free performance improvements. Also, if you're tackling similar bottlenecks, feel free to reach out -- I'm more than happy to nerd out on this stuff.

Written by Guust Ysebie, Software Developer, iText SDK