[HN Gopher] Insecure Features in PDFs (2021)
       ___________________________________________________________________
        
       Author : todsacerdoti
       Score  : 58 points
       Date   : 2024-02-26 08:57 UTC (14 hours ago)
        
 (HTM) web link (web-in-security.blogspot.com)
 (TXT) w3m dump (web-in-security.blogspot.com)
        
       | nness wrote:
       | Why PostScript had to be Turing-complete makes no sense to me.
       | Loops, code-execution, functions, it all seems so unnecessary for
       | a markup and presentation language.
        
         | mananaysiempre wrote:
         | Turing completeness isn't really a huge problem (and not
         | included in that part of PDF anyway). PostScript in its printer
         | application doesn't really have I/O except for, well, the
         | printer, and no raw pointers, so there's a smaller surface
         | than, say, JavaScript (no network access).
        
         | bee_rider wrote:
         | It was the early '80s: the flexibility seemed nice, and the
         | riskiness of putting a full programming language in your
         | documents was probably not as obvious.
         | 
         | Thankfully we learned from them and didn't repeat that mistake
         | over and over again.
        
         | linguae wrote:
         | PostScript does more than markup and presentation; it is an
         | entire 2D graphics engine. At one point PostScript served as
         | the graphics substrate for the Sun NeWS and NeXT Display
         | PostScript-based window systems. While I agree from a security
         | standpoint that its Turing-completeness poses challenges, it
         | also makes it easier to express certain constructs such as
         | complex shapes and fonts programmatically.
         | 
         | I'm not a PostScript expert but I've been reading a lot about
         | it recently. It's a rather fascinating system for 2D graphics.
        
       | mlinksva wrote:
       | Though it barely mentions security,
       | https://willcrichton.net/notes/portable-epubs/ is one of my
       | favorite essays to be posted to HN in a while:
       | https://news.ycombinator.com/item?id=39138042
       | 
       | I wonder how one would measure the costs and benefits (with a
       | focus on security) of speeding up the gargantuan task of
       | shifting to a better, well, portable document format, and of
       | making that shift more security-driven. People are thinking big
       | and want measurement, e.g. the just-posted
       | https://news.ycombinator.com/item?id=39514844 (WH release on
       | memory safety:
       | https://www.whitehouse.gov/oncd/briefing-room/2024/02/26/pre...).
       | Would it not make sense to be similarly ambitious and
       | metrics-driven for this too?
        
         | nonrandomstring wrote:
         | In a lot of situations I use
         | 
         | pdftotext file.pdf - | nroff | less
         | 
         | pdftotext is from the Poppler developers
         | (http://poppler.freedesktop.org) and Glyph & Cog, LLC
         | 
         | nroff is a common GNU utility on most Linux/Unix systems
         | 
         | (though I don't trust poppler utils to be secure)
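
A minimal sketch of scripting the first stage of that pipeline from Python, assuming `pdftotext` from Poppler is installed and on `PATH`. The helper names `pdftotext_command` and `extract_text` and the 30-second default timeout are made up for illustration. The timeout only bounds how long the tool may run; it is not a sandbox, so the caveat about trusting the tool itself still applies.

```python
import os
import subprocess
from typing import List


def pdftotext_command(pdf_path: str) -> List[str]:
    """Build the argv for `pdftotext <file> -` (hypothetical helper)."""
    # Use an absolute path so a file name starting with "-" cannot be
    # mistaken for a command-line option.
    safe_path = os.path.abspath(pdf_path)
    # "-" as the output file tells pdftotext to write the text to stdout.
    return ["pdftotext", safe_path, "-"]


def extract_text(pdf_path: str, timeout_s: float = 30.0) -> str:
    """Run pdftotext and return the extracted text.

    Raises subprocess.CalledProcessError on a non-zero exit status and
    subprocess.TimeoutExpired if the tool runs longer than timeout_s.
    """
    result = subprocess.run(
        pdftotext_command(pdf_path),
        capture_output=True,
        check=True,
        timeout=timeout_s,
    )
    # Replace undecodable bytes instead of failing on odd encodings.
    return result.stdout.decode("utf-8", errors="replace")
```

Calling `extract_text("file.pdf")` would return the same text that the shell pipeline feeds into `nroff`.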
        
           | mlinksva wrote:
           | Yeah, I wasn't referring to the security of tools used to
           | process PDFs, including memory-safe rewrites, but that too!
           | Instead to whether incentivizing migration to a safer
           | portable document format couldn't be justified and done in a
           | similar way to how memory safety is being approached now.
           | 
           | Edit: re-reading, I guess your point is that you can use
           | tools to extract text from PDFs and then read without worry.
           | That brings up another super annoying thing about PDFs: it
           | can be hard to extract text from them with high fidelity.
        
         | mdaniel wrote:
         | The irony doesn't escape me that a link entitled "portable
         | epubs" loads infinitely with the ... ahem, helpful ... text of
         | "To work around the bug, you either need to close any other
         | tabs of this document (in Google Chrome), or try a different
         | browser."
         | 
         | In practice, it's a simple JS kaboom that they didn't catch,
         | because error handling is for n00bs
        
       | mistrial9 wrote:
       | Comments welcome from Leonard Rosenthol (Adobe), fifteen years
       | as PDF architect?
        
       | jimjimjim wrote:
       | I've had to fix the outline/bookmark loop problem before. It is
       | quite an unexpected problem. There are billions of documents out
       | there and who knows how many different PDF generators or editors
       | (writing a PDF is easy, reading a PDF is insanely difficult).
       | I've had normal non-malicious documents where the software trying
       | to merge documents or move pages has caused the outline/bookmark
       | tree to have a loop.
        
       ___________________________________________________________________
       (page generated 2024-02-26 23:00 UTC)