[HN Gopher] Unit Testing PDF Generation
___________________________________________________________________
Unit Testing PDF Generation
Author : ingve
Score : 44 points
Date : 2023-02-27 17:57 UTC (5 hours ago)
(HTM) web link (nibblestew.blogspot.com)
(TXT) w3m dump (nibblestew.blogspot.com)
| t344344 wrote:
| A PDF may have a generation date etc.; much better to use OCR
| and compare strings.
| crazygringo wrote:
| No, at the end of the day the proposed approach of rendering to
| an image and comparing pixels is best. Things can go wrong
| graphically that OCR won't catch, like an entire background
| color or an image going missing.
|
| If you're worried about a generation date in the margin, then
| compare inside of a bounding box that includes most of the page
| but not that margin. Or just use a fixed date for the test,
| even better -- since otherwise you've got to be careful about
| running the test within a few seconds of midnight anyways.
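The bounding-box comparison described above can be sketched in a few lines. This is a stdlib-only illustration, not code from the thread: rendering the PDFs to pixel grids is not shown, and the grid sizes, crop coordinates, and margin location are all invented.

```python
# Compare two rendered pages only inside a bounding box, so a
# volatile margin (e.g. a generation date) is ignored.
def crop(page, left, top, right, bottom):
    """page is a list of rows of grayscale values (0-255)."""
    return [row[left:right] for row in page[top:bottom]]

def pages_match(page_a, page_b, box):
    """True if the two pages are pixel-identical inside box."""
    return crop(page_a, *box) == crop(page_b, *box)

# Two fake "rendered pages" that differ only in the top margin,
# where a generation date would be printed.
blank_row = [255] * 200
a = [blank_row[:] for _ in range(280)]
b = [blank_row[:] for _ in range(280)]
b[5][10] = 0  # a stray "date" pixel in the margin

print(pages_match(a, b, box=(0, 30, 200, 280)))  # True: margin excluded
print(pages_match(a, b, box=(0, 0, 200, 280)))   # False: whole page
```

Using a fixed date, as suggested, removes the need for the crop entirely; the box is a fallback when the output genuinely must vary.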
| ks2048 wrote:
| The example here is drawing a red rectangle, so OCR won't do
| anything.
| zubspace wrote:
| We do something similar, but in my experience small changes,
| like fonts or lines rendering a tad differently after library
| updates, can be quite frequent. Usually these are changes you
| can't really see unless you compare them as two layers in
| paint.net or something.
|
| Adding something like an error margin for all pixels or
| subsections sometimes makes sense, but this can be tricky.
| Downscaling the image and comparing grayscale values with a small
| error margin is another option. It all depends on how accurate
| your tests have to be.
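The per-pixel error margin mentioned above might look like the following. This is a stdlib-only sketch, not from the thread; the tolerance values and the flat-list page representation are invented for illustration.

```python
# Accept small per-pixel drift (anti-aliasing wobble) up to a
# tolerance instead of demanding exact equality.
def fuzzy_equal(page_a, page_b, per_pixel_tol=8, max_bad_fraction=0.001):
    """Pages are flat lists of 0-255 grayscale values, equal length.
    A pixel is 'bad' if it differs by more than per_pixel_tol; the
    pages match if the bad fraction stays under max_bad_fraction."""
    bad = sum(1 for x, y in zip(page_a, page_b) if abs(x - y) > per_pixel_tol)
    return bad / len(page_a) <= max_bad_fraction

# A one-unit rendering wobble on a few pixels passes...
a = [255] * 10_000
b = [254 if i % 1000 == 0 else 255 for i in range(10_000)]
print(fuzzy_equal(a, b))  # True

# ...but a missing black rectangle does not.
c = [0] * 500 + [255] * 9_500
print(fuzzy_equal(a, c))  # False
```

The thresholds are exactly the tricky part the commenter warns about: too loose and real regressions slip through, too tight and the false positives return.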
| izacus wrote:
| Well, but those changes are triggered by something, aren't they?
| So when you upgrade your font library or PDF rendering library,
| you're warned that you're now generating different output and
| can update the golden set.
|
| Your dependencies aren't changing without a cause, are they?
| zubspace wrote:
| Yeah, sure, it just starts to be a problem when you have dozens
| of tests failing because of small rendering changes that can be
| ignored. Someone still has to look at all the test output,
| compare it to the old state, and update the tests with the new
| state. In our case this happened quite a lot.
|
| This is not an issue at first, but the more you use tests
| like this and the more people work with your code, the more
| these false positives start to drag you down.
| [deleted]
| eddsh1994 wrote:
| It's interesting how different people's use of testing
| terminology is across teams/companies/professions. Vocabulary
| is standardized by various ISO standards, the ASQ, and the
| ISTQB so we could all share the same language; then we wouldn't
| have to debate what
| integration/unit/smoke/component/regression/golden/snapshot
| testing means.
| jimjimjim wrote:
| This can get very difficult. Especially with pages that are more
| than just text and images. Lines, interactive content, optional
| layers, annotations, embedded content, blend mode transparencies.
| All of this and more make things complex.
|
| The real problem is that reading a pdf is vastly more complex
| than writing a pdf.
|
| The spec (1000+ pages) is open to interpretation and different
| readers interpret it differently. A page that renders perfectly
| in Adobe may look different when viewed in Firefox, Chrome, or
| Ghostscript.
| [deleted]
| flandish wrote:
| Isn't testing the physical generation of a PDF more aligned
| with an "integration" test than a unit test? Testing the API
| that makes the PDF is fine, but testing like this post
| suggests, with bitwise comparison, is integration testing, no?
| DSMan195276 wrote:
| > Isn't testing the physical generation of a PDF more aligned
| with an "integration" test than a unit test? Testing the API
| that makes the PDF is fine, but testing like this post
| suggests, with bitwise comparison, is integration testing, no?
|
| The fact that it writes the PDF out to a file potentially makes
| it an integration test, but I don't think the rendering aspect
| does. The poster is not testing the integration of the tool
| with Ghostscript; rather, Ghostscript is simply used as an
| oracle for verifying the result. The only thing actually tested
| is the original a4pdf API, but some way of verifying the
| resulting PDF was needed, which is what Ghostscript
| accomplishes. Effectively it's no different from a fancy
| assertion.
| flandish wrote:
| I reckon so. It could align nicely with a mock fs, I suppose.
|
| But if differences in filesystems or architectures are
| crucial, the real proof is in the integration.
| kccqzy wrote:
| I have a much more liberal view of what constitutes a unit
| test: everything that can be run inside a single container is
| a unit test. Writing files? Unit test. Using databases? As
| long as that database is started by the test fixture in the
| same container and destroyed along with the container, still
| a unit test.
|
| Of course, if your test needs a database the natural follow-
| up question is whether it can populate the database with data
| known at build time, or it needs to reach out to get some
| realistic looking data. Only the latter makes it an
| integration test.
| izacus wrote:
| Is naming these tests a seriously useful thing to bikeshed on?
| PaulStatezny wrote:
| There is a distinct and meaningful difference between unit
| tests and integration tests. flandish is not bikeshedding.
|
| Unit tests are about testing a single unit in isolation.
| Integration tests are about testing the integration of
| multiple units.
|
| With unit tests, the industry's general attitude is that
| there should be no side effects, such as reading/writing to
| databases or the disk. Side effects are generally embraced
| for integration tests, on the other hand.
|
| As a result, unit tests are mostly useful for "pure"
| functions, ones where the output is 100% derived from the
| input, regardless of any state external to the function.
| (Such as database records.) However, a large portion of the
| industry hasn't realized this and so you get millions of
| lines of dependency-injected unit tests that really don't
| provide much value in terms of catching actual bugs. (If
| these tests were integration tests, they'd catch actual bugs
| 10x more often.)
|
| A unit test for generating a PDF will not actually involve
| writing a PDF to disk. An integration test, however, might.
|
| So as I said, this isn't bikeshedding. ;-)
| leni536 wrote:
| Well, the blog post could have just called it "test", and
| nobody would bikeshed it.
| mardifoufs wrote:
| Having well-defined terms, and using them well, is essential
| to any type of engineering. I don't know why aiming for
| precise terminology is controversial only in software
| engineering.
| flandish wrote:
| ...yes. Because different energy, documentation, and sometimes
| entire groups of people are attached to different phases.
|
| It's not always a single 100x-elite-monster-drinking coder
| cranking out monoliths in a silo.
|
| I have a hard enough time with project management getting it
| wrong:
|
| - testing an api's public methods is far "faster" than
| testing how files are made on diff procs or fstabs..
|
| - that translates to silly gantt charts...
|
| You get the idea.
| gbro3n wrote:
| This is more of an integration test than a unit test. And if
| you're going to test for a pixel-perfect image match, why not
| check for full equality with a pre-existing PDF file, byte for
| byte? And then what are you testing? That something has
| changed? You'd likely know that the output was going to change,
| so to fix the broken test you'd need to use the failure result
| to create the new comparison file; and if you're always going
| to use the failure output as input for correcting the test,
| what is the point? "Don't test that the code is like the code"
| is a similar principle.
| systems_glitch wrote:
| At a previous job, we created a PDF visual diff tool for this.
| In automated tests, we could look for either red (present in
| sample but not test output) or green (not present in sample,
| but present in test output) to fail a test, or issue an
| automated change approval request.
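The red/green classification the commenter describes can be sketched with a stdlib-only toy. This is not their tool; the binarized-raster representation and the pass/fail rule are invented for illustration.

```python
# Classify each pixel of two binarized page rasters (True = ink):
# "red"   = ink in the sample but not the test output (content lost)
# "green" = ink in the output but not the sample (content added)
def classify_diff(sample, output):
    labels = []
    for s, o in zip(sample, output):
        if s and not o:
            labels.append("red")
        elif o and not s:
            labels.append("green")
        else:
            labels.append(None)
    return labels

def diff_verdict(labels):
    """Fail the test on any red or green pixel; otherwise pass."""
    return "fail" if any(labels) else "pass"

sample = [True, True, False, False]
output = [True, False, False, True]
print(classify_diff(sample, output))  # [None, 'red', None, 'green']
print(diff_verdict(classify_diff(sample, output)))  # fail
```

A real tool would render the labels back into an overlay image so a human can eyeball the red and green regions before approving the change.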
| tantalor wrote:
| It's more like "golden" or "snapshot" testing.
|
| These are _very_ common for web apps, because at the end of the
| day you don't care about the actual HTML & CSS, only how they
| are rendered.
|
| > This is more of an integration test than a unit test.
|
| That's debatable. An integration test generally tests 2 or more
| systems. This kind of test has 2 systems, the generator and the
| renderer, and we care about the output of the renderer, so it
| kind of looks like an integration test. However in an
| integration test you also have control over the implementation
| of both systems; a regression can be in any of the systems. But
| that's not true in snapshot tests: the renderer is a given. If
| the test fails, it's very unlikely to be due to a regression in
| the renderer. So in that sense, you are really only testing a
| single component (the generator) hence it is more like a unit
| test.
| zoover2020 wrote:
| Bravo! Excellent summary
| DavidSJ wrote:
| This sort of test can be useful when you change things under
| the hood in such a way that the output shouldn't have changed.
| ks2048 wrote:
| I've never seen a clear distinction between unit tests and
| integration tests. If you have a black box, "F", with
| input/output pairs you want it to replicate, you encode these
| and call them "tests of 'F'". Why have different names for
| whether "F" is simple or complex?
| simonw wrote:
| I find the distinction between the two extremely frustrating.
|
| Some people act like there's an obvious definition, and maybe
| there is if you're doing pure TDD Java as described in one
| specific text book... but in my experience most developers
| can't provide a good explanation of what a "unit" is.
|
| And those that do... often write pretty awful tests! They
| mock almost everything and build tests that do very little to
| actually demonstrate that the system works as intended.
|
| So I just call things "tests", and try to spend a lot more
| time on tests that exercise end-to-end functionality (which
| some people call "integration tests") than tests that operate
| against one single little function.
| CiaranMcNulty wrote:
| They both revolve around a coherent concept of what a 'Unit'
| is: if you have a (shared, project-level) understanding of a
| Unit, then a 'Unit test' is what tests it, and an 'Integration
| test' involves more than one Unit.
| patrickg wrote:
| This is what I do:
|
| I have ca. 190 test cases on which I run my software and
| compare the MD5 sums of the resulting PDFs. If they are not
| the same, I create a PNG for every page and compare them
| visually with ImageMagick.
|
| The trick is to remove all random stuff from the PDF (like ID
| generation and such).
|
| This takes about 3 seconds on an M1 Pro laptop. I think this
| is very much okay.
|
| Links: https://github.com/speedata/publisher/tree/develop/qa
| (the tests)
| https://github.com/speedata/publisher/blob/develop/src/go/sp...
| (the Go source code for the comparison)
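The hash-first strategy above (cheap exact check, visual diff only on mismatch) can be sketched like this. A stdlib-only illustration, not the linked Go code; the file paths are placeholders, and it assumes the nondeterministic bits (document IDs, dates) have already been stripped from the PDFs.

```python
# Compare generated PDFs by MD5 first; only fall back to a
# per-page image diff when the hashes differ.
import hashlib

def md5_of(path):
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()

def pdfs_identical(path_a, path_b):
    """Cheap first pass: exact byte equality via MD5."""
    return md5_of(path_a) == md5_of(path_b)

# On mismatch, one might rasterize each page (e.g. with
# Ghostscript) and diff the PNGs with ImageMagick -- not shown,
# since that shells out to external tools.
```

The fast path is what keeps ~190 cases down to a few seconds: rasterizing is only paid for the handful of tests that actually changed.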
| mattgreenrocks wrote:
| These are typically called smoke tests, and can be valuable for
| regression testing of third party libraries you depend on.
|
| An alternate approach: generate the PDF, then run it through a
| PDF reader library to scrape the text out and ensure it is
| there.
| izacus wrote:
| Your approach will completely miss big changes like missing
| pictures, broken layout, missing backgrounds, and other
| rendering breakages, as well as text which isn't embedded as
| a text layer.
| mattgreenrocks wrote:
| Of course. It was meant for argument, not as a comprehensive
| approach to testing PDFs. :)
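The text-scrape check proposed above could look like the following toy. Real PDFs need a proper library (pypdf, for example); this stdlib regex only handles simple, uncompressed content streams, and the fake PDF bytes are invented for illustration.

```python
# Pull the string arguments of Tj (show-text) operators out of an
# uncompressed PDF content stream, then assert expected text exists.
import re

def extract_show_text(pdf_bytes):
    """Collect (string) Tj arguments from uncompressed PDF bytes."""
    return re.findall(rb"\((.*?)\)\s*Tj", pdf_bytes)

fake_pdf = b"""%PDF-1.4
1 0 obj << /Length 44 >> stream
BT /F1 12 Tf 72 700 Td (Hello, invoice 42) Tj ET
endstream endobj"""

print(extract_show_text(fake_pdf))  # [b'Hello, invoice 42']
```

As the replies note, this only verifies text content; it says nothing about layout, images, or backgrounds, which is why the raster-diff approaches dominate the thread.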
| geraldwhen wrote:
| It is extremely hard to make two PDF generation runs produce
| the same output binary, especially on CI vs. locally.
___________________________________________________________________
(page generated 2023-02-27 23:01 UTC)