[HN Gopher] Exploiting aCropalypse: Recovering truncated PNGs
       ___________________________________________________________________
        
       Exploiting aCropalypse: Recovering truncated PNGs
        
       Author : Retr0id
       Score  : 123 points
       Date   : 2023-03-18 12:58 UTC (10 hours ago)
        
 (HTM) web link (www.da.vidbuchanan.co.uk)
 (TXT) w3m dump (www.da.vidbuchanan.co.uk)
        
       | johdo wrote:
       | >Google was passing "w" to a call to parseMode(), when they
       | should've been passing "wt" (the t stands for truncation). This
       | is an easy mistake, since similar APIs (like POSIX fopen) will
       | truncate by default when you simply pass "w". Not only that, but
       | previous Android releases had parseMode("w") truncate by default
       | too! This change wasn't even documented until some time after the
       | aforementioned bug report was made.
       | 
       | Reading about this silent API change makes me feel like I'm
       | losing braincells. What's going on with the processes behind
       | Android's development?
        
         | BoorishBears wrote:
         | I have some pretty unique insight into this since I work with
         | AOSP a lot and have worked with a few engineers on Android's
         | core system apps:
         | 
         | Google's engineers working on Android at a system level
         | regularly break basic functionality in the "userspace"*.
         | Google's engineers working on Android apps get early access to
         | the Android versions and work through the resulting bugs,
         | bubbling them back up until they get fixed.
         | 
         | (*userspace being used loosely here, it's all userspace vs
         | being in a kernel, but it's interfaces that are implemented at
         | the OS level and consumed at the app level)
         | 
         | Like Google is large enough that I'm sure someone will take
         | offense to implying that such questionable engineering takes
         | place there, but this isn't a story I've heard just once.
         | People working on apps that are part of every GMS enabled
         | Android image have confirmed this completely bizarre setup on
         | multiple separate occasions.
        
       | Groxx wrote:
       | > _IMHO, the takeaway here is that API footguns should be treated
       | as security vulnerabilities._
       | 
       | Yeah, especially in this case, due to changing defaults and
       | similar-but-differently-behaving APIs.
       | 
       | Defaults really suck sometimes. But so does not having any. And
       | so many things can become security issues when used _just so_.
       | 
       | :/
        
         | olliej wrote:
         | See that's not what happened here. It wasn't that the API had a
         | footgun (I'll leave out "is this API actually good"). It was
         | that someone decided that changing core API behaviour after
         | that library had shipped was acceptable - and it isn't.
         | 
         | That's why shipping a new API requires a lot of time investment
         | in the design of the API: once an API is shipped you can't just
         | change the behavior dramatically.
        
       | ericpauley wrote:
       | I would assume that any image reformatting or exif stripping by
       | online platforms would protect against this. Yet another good
       | reason to include this when developing apps.
        
         | olliej wrote:
         | This isn't an exif issue.
         | 
         | This isn't a metadata issue.
         | 
         | An underlying IO library changed its behavior so that instead
         | of truncating a file when opened with the "w" mode (as fopen
         | and similar have always done, and this API did originally), it
         | left the old data there. If the edited image is smaller than
         | the original file, then the tail of the original image is left
         | in the file. There is enough information to just directly
         | decompress that data and so recover the pixel data from the end
         | of the image.
         | 
         | You're not necessarily recovering the edited image data, just
         | whatever happens to be at the end of the image. If you are
         | unlucky (or lucky depending on PoV) the trailing data still
         | contains the pixel data from the original image - in principle
         | the risk is proportional to how close to the bottom right of
         | the image the edits were (depending on image format).
        
           | ericpauley wrote:
           | Not saying it is. Sensible exif stripping (re-serialization)
           | also has the upside of removing trailing data, which would
           | prevent this.
        
         | Retr0id wrote:
         | EXIF stripping won't necessarily catch it (but probably would
         | in most instances - depends on how you do it), but reformatting
         | or reencoding will.
        
           | ericpauley wrote:
           | I'm guessing most exif stripping would deserialize the image
           | and write a new file, so unless that has the same bug as this
           | (overwriting the existing file without truncation), it ought
           | to work?
        
             | jsheard wrote:
             | Discord strips EXIF but the author was still able to
             | unredact the images they'd posted there.
             | 
             | Some implementations of EXIF stripping might help, but it's
             | not guaranteed.
        
               | Retr0id wrote:
               | Discord doesn't strip EXIF from PNGs, only JPEGs
        
               | jsheard wrote:
               | Seriously? What's the reasoning behind that?
        
               | Retr0id wrote:
               | It's rare to see PNGs in-the-wild containing EXIF data,
               | it's a feature that's only been in the spec since ~2017.
               | I'm actually looking for one to double-check my statement
               | about discord, but I can't find any.
               | 
               | Edit: I made my own. I can confirm that the exif chunk
               | was not stripped. https://cdn.discordapp.com/attachments/
               | 541730746805649476/10...
        
             | Retr0id wrote:
             | A naive approach to stripping EXIF from a PNG would be to
             | parse up to the start of the first eXIf chunk, discard the
             | contents of that chunk, and then include the rest of the
             | file verbatim without actually parsing anything.
             | 
             | But yes, a more sensibly coded EXIF stripper would
             | deserialise and reserialise. Unfortunately I am no longer
             | able to assume that programmers will behave sensibly.
             | 
             | Edit: Also, the PNGs generated by Markup don't contain EXIF
             | in the first place, so an EXIF stripper could reasonably
             | decide that no changes are necessary at all.
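
The difference between the naive stripper and a chunk-aware one can be sketched in Python. This is only an illustration under stated assumptions: `make_chunk` and `strip_exif` are hypothetical names, the sample file is a hand-built 1x1 grayscale PNG, and the eXIf body is a placeholder TIFF header. It is not how Discord or any particular tool works. The point is that walking every chunk and stopping at IEND drops trailing data as a side effect, even when there is no eXIf chunk to remove.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(ctype: bytes, body: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, body, CRC over type + body."""
    crc = zlib.crc32(ctype + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)


def strip_exif(png: bytes) -> bytes:
    """Copy chunks up to and including IEND, skipping every eXIf chunk.

    Anything after IEND is never copied, so trailing data is dropped.
    """
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    out = [PNG_SIGNATURE]
    pos = len(PNG_SIGNATURE)
    while True:
        if pos + 8 > len(png):
            raise ValueError("file ends before IEND")
        length, ctype = struct.unpack(">I4s", png[pos:pos + 8])
        end = pos + 12 + length  # 4 length + 4 type + body + 4 CRC
        if end > len(png):
            raise ValueError("chunk runs past end of file")
        if ctype != b"eXIf":
            out.append(png[pos:end])
        pos = end
        if ctype == b"IEND":
            break
    return b"".join(out)


# Hand-built sample: 1x1 8-bit grayscale PNG with an eXIf chunk and a stale tail.
ihdr = make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
exif = make_chunk(b"eXIf", b"MM\x00*\x00\x00\x00\x08\x00\x00")  # placeholder body
idat = make_chunk(b"IDAT", zlib.compress(b"\x00\x00"))  # filter byte + one pixel
iend = make_chunk(b"IEND", b"")
sample = PNG_SIGNATURE + ihdr + exif + idat + iend + b"leftover tail"

cleaned = strip_exif(sample)
```

A stripper that instead copies "the rest of the file verbatim" after the eXIf chunk would keep `b"leftover tail"`, which is the failure mode described above.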
        
       | Waterluvian wrote:
       | EXIF metadata is useful but we strip it when we post an image
       | because it's also a security vulnerability.
       | 
       | Image edit metadata also seems like an incredibly useful feature.
       | Do we just strip it as well?
        
         | Retr0id wrote:
         | Since you read the article beforehand, you know that this
         | comment is entirely orthogonal to the vulnerability in
         | question.
        
           | Waterluvian wrote:
           | I think it's okay to talk about the core issue that leads to
           | that. From the linked tweet it looks like there's edit data
           | stored in the image, allowing the original to be recovered?
           | 
           | Do you have a specific concern to warrant your comment?
        
             | Retr0id wrote:
             | It's not the core issue, and it's misleading to suggest
             | that it is. I suggest reading the aptly named "Root Cause
             | Analysis" section of the linked article.
        
               | Waterluvian wrote:
               | I'm trying to follow the article. So it's not the image
               | format specifically that is holding on to the blacked out
               | pixels, it's the compression method that the image format
               | uses, or more specifically, how Google's code is handling
               | that work?
               | 
               | Is this possibly a helpful feature or is it really just a
               | terrible hack/bug that has no practical use holding on to
               | a sort of edit history inside a PNG?
               | 
               | I would love a way to track some level of history in a
               | commonly supported image format (but of course being
               | aware of needing to strip it when appropriate)
        
               | Retr0id wrote:
               | It's neither a feature nor a hack, it's simply a bug
               | related to missing the O_TRUNC flag when opening the file
               | for modification. No deliberate attempt was made to "hold
               | onto" any data.
        
               | Waterluvian wrote:
               | I feel like we're talking past each other. I'll find my
               | answers elsewhere. Thanks for your time!
        
               | tslater2006 wrote:
               | My (limited) understanding is that if you have, say, a 5 MB
               | file, and you open it for writing and write 3 MB, you
               | might expect the file to be 3 MB. But if you didn't
               | specify the truncate flag (the bug here), the file is
               | still actually the 5 MB it was. The image appears cropped
               | because the relevant metadata has the new sizes etc., but
               | that 2 MB of extra data is still there by mistake. This
               | can be used to recover some of the original image.
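
The size behaviour described here can be reproduced with a few lines of Python. This is a minimal sketch using POSIX-style open flags, not the Android code path: `overwrite` is a hypothetical helper, and 5000 and 3000 bytes stand in for the 5 MB and 3 MB files.

```python
import os
import tempfile


def overwrite(path: str, data: bytes, truncate: bool) -> int:
    """Write `data` at the start of `path` and return the resulting file size."""
    flags = os.O_WRONLY | os.O_CREAT | getattr(os, "O_BINARY", 0)
    if truncate:
        flags |= os.O_TRUNC  # discard the old contents before writing
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return os.path.getsize(path)


original = b"A" * 5000  # stands in for the 5 MB original
edited = b"B" * 3000    # stands in for the 3 MB cropped version

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "screenshot.bin")

    # Case 1: no O_TRUNC. The old tail survives past the new data.
    with open(path, "wb") as f:
        f.write(original)
    size_without = overwrite(path, edited, truncate=False)
    with open(path, "rb") as f:
        leftover = f.read()[len(edited):]

    # Case 2: with O_TRUNC. The file shrinks to the new data.
    with open(path, "wb") as f:
        f.write(original)
    size_with = overwrite(path, edited, truncate=True)

print(size_without, size_with)  # 5000 3000
```

In the first case `leftover` holds the last 2000 bytes of the original, which corresponds to the "2 MB of extra data" in the comment above.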
        
               | Waterluvian wrote:
               | Aha! Thanks.
               | 
               | So this behaviour is likely unfit to be used as a
               | feature, but could in some cases be used as a clever hack
               | to, in a way, preserve some edited (cropped) data.
        
               | Retr0id wrote:
               | No, it is not suitable as a clever hack. It doesn't work
               | reliably or well enough for that.
        
       | Juerd wrote:
       | Very simple script to check which PNG files have trailing data:
       | https://gist.github.com/Juerd/40071ec6d1da9d4610eba4417c8672...
       | 
       | I hope that people who host forums, image boards, chat
       | applications, etc., will delete or fix potentially vulnerable
       | images before anyone uses them maliciously.
       | 
       | One way to repair a vulnerable image is to use `optipng -fix`.
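
For readers who want to see what such a check involves, here is a rough Python sketch. It is not the linked gist and not what optipng does internally. `split_at_iend` is a hypothetical helper, and the sample is a hand-built 1x1 PNG with fake trailing bytes. It walks the chunk list to the first IEND and reports whatever follows; keeping only the first part is one way to discard the tail.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def split_at_iend(png: bytes) -> tuple[bytes, bytes]:
    """Return (bytes up to and including the first IEND chunk, trailing bytes)."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while True:
        if pos + 8 > len(png):
            raise ValueError("file ends before IEND")
        length, ctype = struct.unpack(">I4s", png[pos:pos + 8])
        pos += 12 + length  # 4 length + 4 type + body + 4 CRC
        if pos > len(png):
            raise ValueError("chunk runs past end of file")
        if ctype == b"IEND":
            return png[:pos], png[pos:]


def chunk(ctype: bytes, body: bytes) -> bytes:
    """Build one PNG chunk (used only to construct the sample below)."""
    crc = zlib.crc32(ctype + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)


# Hand-built sample: a valid 1x1 grayscale PNG followed by stale bytes.
valid = (
    PNG_SIGNATURE
    + chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
    + chunk(b"IDAT", zlib.compress(b"\x00\x00"))
    + chunk(b"IEND", b"")
)
sample = valid + b"stale bytes from the old file"

clean, tail = split_at_iend(sample)
if tail:
    print(f"{len(tail)} trailing bytes after IEND")
```

A non-empty `tail` only shows that something follows IEND; it does not prove the file came from an affected device, since other tools also append data to PNGs.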
        
       | zorlack wrote:
       | It'd be so interesting to collect aCropalypse-affected images.
       | Maybe you could build a crop-suggester out of it...
       | 
       | Not that I'd want to maintain custody of such a dataset...
        
       | Wingman4l7 wrote:
       | Why the hell is this exploit being fully provided for use via a
       | handy-dandy web interface? An image /cleanup/ tool is one
       | thing... this is very irresponsible.
        
         | moosedev wrote:
         | That was my first thought when I clicked on the website link in
         | the Twitter thread -- expecting a disclosure/high-level info
         | page in the fashion of the last decade of big-deal exploits
         | with cute names -- and found only a tool the tweet author (not
         | OP, but apparently working with him?) built that runs in-
         | browser, requires no knowledge/setup, and appears to enable
         | recovery of cropped-out image data at scale by even non-
         | technical users. Jeez.
         | 
         | Edit: I find myself wryly weighing this against the ongoing
         | unleashing of LLMs upon the world. Both have shades of clever
         | people prioritizing being clever, and demonstrating it, at the
         | cost of... other stuff. On the bright side, it is distracting me
         | from facepalming at the underlying Pixel bug.
        
         | andersa wrote:
         | I wonder if hiding the tool would help. Anyone interested could
         | simply archive and hoard potentially interesting images until
         | such a tool emerges later. So in reality, it would change
         | nothing, only slightly delay the images being extracted.
         | 
         | The only thing I can think of that would have made a real
         | difference is to send a tool to fix the images to all image
         | hosting platforms in advance. But which ones do you trust?
        
       | Retr0id wrote:
       | Earlier related discussion
       | https://news.ycombinator.com/item?id=35207787
        
       ___________________________________________________________________
       (page generated 2023-03-18 23:01 UTC)