[HN Gopher] Hacking with PDF (2022)
___________________________________________________________________
Hacking with PDF (2022)
Author : lnyan
Score : 81 points
Date : 2024-08-17 13:03 UTC (9 hours ago)
(HTM) web link (0xcybery.github.io)
(TXT) w3m dump (0xcybery.github.io)
| banku_brougham wrote:
| This is a great demo. I've been concerned about all these PDFs I
| like to read; this gives me a little more confidence about tools
| to scan PDFs for attacks.
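As a rough illustration of what "scanning a PDF for attacks" can mean at its simplest, here is a minimal sketch that counts PDF name tokens often associated with active content. The list of names and the function name `count_suspicious_names` are assumptions made for this example. A raw byte scan like this misses names inside compressed object streams and names written with `#xx` hex escapes, so it is a first-pass triage aid and not a real scanner.

```python
import re

# Name tokens that commonly mark active content in a PDF.
# This list is illustrative (an assumption for the sketch), not exhaustive.
SUSPICIOUS_NAMES = [b"/JavaScript", b"/JS", b"/OpenAction", b"/AA", b"/URI", b"/Launch"]


def count_suspicious_names(data: bytes) -> dict:
    """Count occurrences of each suspicious name in the raw bytes of a PDF."""
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # Require a non-alphanumeric character (or end of data) after the name,
        # so that /JS does not match a longer name such as /JSFoo.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode()] = len(re.findall(pattern, data))
    return counts


if __name__ == "__main__":
    sample = b"<< /S /JavaScript /JS (app.alert(1)) >>"
    print(count_suspicious_names(sample))
```

A non-zero count only says the feature is present; plenty of benign PDFs contain `/URI` links or an `/OpenAction` that just sets the initial view.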
| JKCalhoun wrote:
| FWIW, ages ago I wrote the PDFKit framework for the Mac (used by
| Preview and the built-in PDF viewer in Safari).
|
| The only exploit listed here that has a chance of working with
| Preview/Safari (PDFKit) is the URI one -- none of the Javascript
| exploits will work.
|
| Why? I never implemented Javascript support [1].
|
| Security was extremely important at Apple (there's a whole
| security team that frequently interacts with the various project
| owners around the company, write and deploy file fuzzers, create
| must-fix Radars around exploits found in the wild, etc.).
|
| In fact, though, I had no idea how I would host a Javascript
| runtime, and I didn't really have the cycles to implement it even
| if I had known how to. Anyway, we were content to support the 99%
| of PDFs out there.
|
| [1] In fact there were a few US tax documents that used very
| simple Javascript snippets to take the values from two fields,
| add them, and put the result in a third. Some code in PDFKit I
| added would identify these few very simple patterns and implement
| them sans JS runtime.
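The footnote above describes recognizing a few fixed script shapes instead of running a JavaScript engine. The sketch below shows one way that idea could look. It is not PDFKit's code: the exact script shape, the regular expression, and the name `apply_simple_sum` are all assumptions made for illustration. The only pattern it accepts is "target field = field + field"; anything else is ignored and never executed.

```python
import re

# One assumed script shape: getField("t").value = getField("a").value + getField("b").value;
FIELD = r'(?:this\.)?getField\("([^"]+)"\)\.value'
SUM_PATTERN = re.compile(rf'^\s*{FIELD}\s*=\s*{FIELD}\s*\+\s*{FIELD}\s*;?\s*$')


def apply_simple_sum(script: str, fields: dict) -> bool:
    """If the script matches the sum pattern, update `fields` and return True.

    Scripts that do not match are left alone; nothing is ever evaluated.
    """
    match = SUM_PATTERN.match(script)
    if match is None:
        return False
    target, left, right = match.groups()
    # Missing or empty fields count as zero in this sketch (an assumption).
    total = float(fields.get(left) or 0) + float(fields.get(right) or 0)
    fields[target] = total
    return True


if __name__ == "__main__":
    form = {"wages": "100", "interest": "23.5"}
    script = 'this.getField("total").value = this.getField("wages").value + this.getField("interest").value;'
    print(apply_simple_sum(script, form), form)
```

The appeal of this design is that the attack surface is a single anchored regular expression: a script either has exactly the expected shape or it does nothing.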
| felipefar wrote:
| Nice job! I've been wanting to write a PDF parser for learning
| purposes, but have been put off by the quantity of files that
| open source PDF parsers have on their repos and the different
| tech that they need (image formats, compression formats, etc.).
| I'll probably settle for a reasonable ratio between PDFs
| supported/learning extracted from the project, so it's useful
| knowing that PDFs with JS are not very widely used.
|
| Also, I'm the developer of a reference management software, and
| have naturally been thinking about what it'd take to save in
| the PDF file metadata fields that are generally useful for
| advanced readers and academics: original publication dates,
| ISBNs, DOIs, edition, publisher, etc., instead of just author
| and title.
| gettalong wrote:
| You can get a long way with only implementing the most basic
| things of the PDF specification, like section 7. And even
| there you don't need everything. For example, there is no
| need to implement the CCITTFaxDecode, JBIG2Decode, DCTDecode
| or JPXDecode filters if you don't want to get at the raw
| pixels of the images.
|
| Once you have parsing and writing of a simple PDF file going
| (sections 7.2, 7.3, 7.4, 7.5, 7.7), add in support for
| encryption (section 7.6). Now you are able to at
| least parse and write nearly all PDF files.
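To make "parsing and writing of a simple PDF file" a little more concrete, here is a minimal sketch of the file structure part (section 7.5): it writes a three-object PDF with a classic cross-reference table, then reads the table back starting from `startxref`. The function names are made up for this example, and the reader is deliberately limited: it assumes a single xref subsection, LF line endings, and no cross-reference streams or incremental updates, all of which a real parser has to handle.

```python
import re


def build_minimal_pdf() -> bytes:
    """Write a tiny one-page PDF with a classic xref table and trailer."""
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 200 200] >>",
    ]
    out = bytearray(b"%PDF-1.7\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset where "N 0 obj" starts
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"
    xref_offset = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # object 0 is always the free-list head
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset  # each entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_offset
    return bytes(out)


def read_xref_offsets(data: bytes) -> dict:
    """Return {object number: byte offset} for in-use objects.

    Sketch only: assumes one xref subsection and LF line endings.
    """
    match = re.search(rb"startxref\s+(\d+)\s+%%EOF\s*$", data)
    if match is None:
        raise ValueError("no startxref found")
    lines = data[int(match.group(1)):].split(b"\n")
    if lines[0].strip() != b"xref":
        raise ValueError("startxref does not point at an xref table")
    first, count = map(int, lines[1].split())
    offsets = {}
    for i in range(count):
        offset, _generation, kind = lines[2 + i].split()
        if kind == b"n":  # 'n' marks an in-use object, 'f' a free one
            offsets[first + i] = int(offset)
    return offsets


if __name__ == "__main__":
    pdf = build_minimal_pdf()
    print(read_xref_offsets(pdf))
```

Reading from the end of the file first is the point of the format: the reader can jump straight to any object by offset without scanning the whole file.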
|
| Then implement all the things you need gradually. For example:
|
| * Need support for parsing or creating the contents of a
| page? -> sections 7.8, 8, and 9. Mind you, start out with
| only supporting the built-in PDF fonts for creating text and
| later add support for TrueType (easier) and OpenType (harder
| if you need to implement the font parser yourself).
|
| * Need support for annotations? -> section 12.5
|
| And so on.
|
| If you just need to store the metadata in the PDF, you only
| need support for parsing and writing a PDF because this
| usually also entails that you can modify the PDF object tree
| which is needed for storing the metadata. However, if you
| need to store that metadata in a way that is usable for other
| PDF processors, you would need to store it as an XMP file and
| creating that is yet another deep dive if you don't have an
| XMP library available. See section 14.3.2 in the PDF spec for
| this (btw. the latest PDF spec is available at no cost at
| https://pdfa.org/resource/iso-32000-2/).
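For a sense of what "store it as an XMP file" involves, here is a minimal sketch that builds an XMP packet with a few Dublin Core properties using only the Python standard library. The function name `build_xmp_packet` and the choice of fields are assumptions for this example, and the sample values are placeholders. It uses the generic `dc:identifier` property for a DOI or ISBN; dedicated properties from other schemas exist but are not shown here. Embedding the packet in a PDF additionally requires writing it as a metadata stream (`/Type /Metadata /Subtype /XML`) referenced from the catalog's `/Metadata` entry, which this sketch does not do.

```python
import xml.etree.ElementTree as ET

NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
for _prefix, _uri in NS.items():
    ET.register_namespace(_prefix, _uri)

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def q(prefix: str, tag: str) -> str:
    """Build an ElementTree qualified name such as {uri}title."""
    return "{%s}%s" % (NS[prefix], tag)


def build_xmp_packet(title: str, authors: list, publisher: str, identifier: str) -> str:
    """Return an XMP packet (as text) carrying a few Dublin Core properties."""
    root = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(root, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})

    # dc:title is a language alternative: rdf:Alt with an x-default entry.
    alt = ET.SubElement(ET.SubElement(desc, q("dc", "title")), q("rdf", "Alt"))
    ET.SubElement(alt, q("rdf", "li"), {XML_LANG: "x-default"}).text = title

    # dc:creator is an ordered list of names: rdf:Seq.
    seq = ET.SubElement(ET.SubElement(desc, q("dc", "creator")), q("rdf", "Seq"))
    for author in authors:
        ET.SubElement(seq, q("rdf", "li")).text = author

    # dc:publisher is an unordered list: rdf:Bag.
    bag = ET.SubElement(ET.SubElement(desc, q("dc", "publisher")), q("rdf", "Bag"))
    ET.SubElement(bag, q("rdf", "li")).text = publisher

    # dc:identifier is plain text; used here for a DOI or ISBN string.
    ET.SubElement(desc, q("dc", "identifier")).text = identifier

    body = ET.tostring(root, encoding="unicode")
    header = '<?xpacket begin="\ufeff" id="W5M0MpCehiHzreSzNTczkc9d"?>'
    return header + "\n" + body + "\n" + '<?xpacket end="w"?>'


if __name__ == "__main__":
    # Placeholder values, not a real publication.
    print(build_xmp_packet("Example Title", ["A. Author"], "Example Press", "doi:10.0000/example"))
```

Most of the effort in real XMP support goes into the parts skipped here: reading and merging an existing packet, keeping it in sync with the Info dictionary, and handling schemas beyond Dublin Core.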
| lysace wrote:
| PDFKit is awesome to use. Thanks!
| jahewson wrote:
| Nice! There was an exploit in iOS Messages found last year due
| to code that Apple had under license from Xpdf. I've wondered
| why Apple needed that when they already had PDFKit?
| bla3 wrote:
| Do you know if it's still maintained? I have a bunch of PDFs
| where images don't show up in Preview. I filed bugs for them,
| but they're being ignored.
| jjbinx007 wrote:
| I've always held the opinion that viewing PDFs in something other
| than Adobe Acrobat gives the user more of a chance of avoiding
| such attacks... is there any credence to this or is it just
| wishful thinking?
| unanimous wrote:
| I've tried creating a PDF Canarytoken [0] and opening it in a
| few applications not including Adobe Acrobat. None of them
| triggered the canary.
|
| [0]: https://canarytokens.org/nest/
| agumonkey wrote:
| Acrobat implements more features than, say, MuPDF, I assume. So in
| terms of attack surface it would be riskier. But maybe they
| have more security analysts too.
___________________________________________________________________
(page generated 2024-08-17 23:00 UTC)