[HN Gopher] ExifTool CVE-2021-22204 - Arbitrary Code Execution (...
___________________________________________________________________
ExifTool CVE-2021-22204 - Arbitrary Code Execution (2021)
Author : ekiauhce
Score : 89 points
Date : 2024-01-27 11:58 UTC (11 hours ago)
(HTM) web link (devcraft.io)
(TXT) w3m dump (devcraft.io)
| fulafel wrote:
| Perl and eval, not what I'd have expected from the title!
| rkagerer wrote:
| Indeed! Arbitrary text was smuggled into the eval() statement
| through a weak input sanitization step:
|
| _The second quote was not escaped because in the regex $tok =~
| /(\\\\+)$/ the $ will match the end of a string, but also match
| before a newline at the end of a string, so the code thinks
| that the quote is being escaped when it's escaping the
| newline._
|
| This is why I hate using regexes, or deciphering other people's
| code which uses them. The syntax just isn't obvious. We need
| tooling that _removes_ mental overhead, not adds to it. I 've
| always found plain old procedural parsing code easier to
| formulate, trace and reason about.
|
| I'm not some some fanatic preaching they should be banned from
| all languages, but I do feel there are places they're
| inappropriate and input sanitization is one of them. It's also
| good practice to centralize this sort of logic in some flavor
| of escape() function to reduce the whack-a-mole when bugs like
| this are found.
| jodrellblank wrote:
| > " _will match the end of a string, but also match before a
| newline at the end of a string._ "
|
| Is it just me, or is this a poor description? I understood $
| to be an anchor for end of _line_ , not end of _string_ ,
| i.e. "match before a newline anywhere". The emphasis "newline
| _at the end of a string_ " feels misleading; the exploit
| works because the matched newline isn't at the end of the
| string and there is exploit code after it.
|
| > " _I 've always found plain old procedural parsing code
| easier to formulate, trace and reason about._"
|
| A regex is a domain specific language which lets you express
| a lot of computation in a small amount of code. At the
| extreme, you're saying "I can write a regex engine easier
| than I can write a regex". "I can write assembler easier than
| I can write C". "I can write a list of a thousand items
| quicker than I can write range(1000)". It doesn't make sense,
| and for anything more than trivial cases, I don't believe it.
|
| (I'll agree that there are places they are inappropriate,
| expressions which aren't clear, and maybe the extra effort of
| doing it by hand is worth it in security sensitive
| situations, but I claim it is extra effort and less clear
| what a pile of ifs/loops/switch is trying to achieve vs
| "[a-f]{3}(\d)" or whatever).
| joosters wrote:
| The meaning of ^ and $ in perl regexes can be altered by
| modifiers at the end of the regex. A 'multi-line' regex,
| with a /m at the end, makes them match the start and end of
| any line.
| jwilk wrote:
| The code doesn't use /m. So indeed the problem is that $
| will match before \n at the end of the string _the Perl
| interpreter is working with_ , which is not the whole
| thing containing the payload.
| jodrellblank wrote:
| I've just worked through it to understand it; it's a more
| subtle exploit than I thought. My above comment about $
| being end-of-line is wrong, the above comment saying you
| need the /m modifier for that to happen is correct.
|
| In case anyone cares, the bug matches <backslash>
| <newline> <quote> then the regex in question matches
| <backslash> <newline> and the code logic is that one
| backslash must be escaping the next character from the
| input, but instead of adding on the next character _it
| always adds a quote_ - after searching for a quote the
| next character must be a quote, right? But it wasn 't, it
| was a newline, which was missed because of $ behaviour
| with newline at the end of a string. That acts to shift
| the quote one to the left <backslash> <quote> <newline>
| and now that makes an escaped quote which won't break
| eval() and the loop carries on and reads in the exploit
| text up to the real end quote.
|
| ----
|
| Details: the Perl code takes this pattern in the input
| (the quotes are part of the input, in the file data):
| "a\ ""
|
| six characters, describing a quoted string of four
| characters: <a> <backslash> <newline> <quote>
|
| The Perl code finds the string starting quote and moves
| past it, and sets $tok empty to hold the quoted string
| content. Then it searches for the next quote (not the
| last, the next): last Tok unless
| $$dataPt =~ /"/sg;
|
| This will match at the quote after the newline. Then it
| substrings from the saved opening quote position to 1
| before the found quote. So the substring includes the
| newline char: # the closing quote
| position. (not including the quote). $tok .=
| substr($$dataPt, $pos, pos($$dataPt)-1-$pos);
|
| That gets <a> <backslash> <newline> and not the following
| <quote> <quote>
|
| Then it does the odd-number-of-backslashes test on the
| substring <a> <backslash> <newline>:
| last unless $tok =~ /(\\+)$/ and length($1) & 0x01;
|
| And the regex matches for <backslash> <newline> instead
| of the intended <backslash> ENDOFSTRING so the code
| thinks there is an escaped character. It doesn't _add the
| next character from the string into the token_ , it
| _assumes the escaped character must be a quote_ and
| _always adds a quote_ to $tok. $tok .=
| '"'; # quote is part of the string
|
| Effectively shifting the input quote to the other side of
| the newline, from: "a\ ""
|
| to "a\" "
|
| and in the exploit case: "a\
| "exploit code"
|
| to "a\" exploit code"
|
| Then the inner loop runs again and finds the <exploit
| code to closing quote> text and adds that on to $tok. Now
| there's a string with an escaped quote and some exploit
| code.
| rixrax wrote:
| So how do I know if there is bug bounty available for
| vulnerabilities in exiftool? Or ghostscript? Or ffmpeg, openssl,
| gnutls, sox, or any number of other packages I may be using?
| IshKebab wrote:
| They're open source and written by volunteers so there almost
| certainly isn't. Unless it's covered by this:
|
| https://bughunters.google.com/about/rules/6521337925468160/g...
| richbell wrote:
| There's also https://www.hackerone.com/internet-bug-bounty
| andersa wrote:
| For open source libraries, you are usually better off finding a
| large company that uses it in some exposed way, then submitting
| it to their bug bounty.
|
| Sometimes you can even collect the bounty multiple times by
| sending it to multiple companies, so long as the first one
| doesn't submit the fix before the second even looks at the
| report...
| TheDong wrote:
| There is at least one bug bounty available for all open source
| projects.
|
| I run a bug bounty where if you tell me about the high level
| details of a bug and it sounds interesting, I will buy you a
| drink (up to $20, so that's the bounty) in exchange for you
| telling me all the details.
|
| All applications for my bug bounty program must be in person,
| feel free to take me up on it!
| voytec wrote:
| Note "2021"
| ptx wrote:
| This was fixed in version 12.24 (back in 2021) according to the
| version history page[0], but the current version still uses
| "eval" in several places[1]. This seems like an unnecessarily
| dangerous approach - wouldn't it have been a good idea to fix all
| the instances in the codebase after the bug was discovered?
|
| [0] https://exiftool.org/ancient_history.html#v12.24
|
| [1]
| https://github.com/search?q=repo%3Aexiftool%2Fexiftool+%2Fev...
| fulafel wrote:
| Indeed. Also I wonder if Gitlab is still using this.
|
| Gitlab self-hosted users who didn't patch got bitten with in
| the wild exploitation:
| https://www.rapid7.com/blog/post/2021/11/01/gitlab-unauthent...
| mdaniel wrote:
| Seems the answer is yes: https://gitlab.com/search?search=exi
| ftool&nav_source=navbar&...
|
| https://gitlab.com/gitlab-
| org/gitlab/-/blob/v16.8.1-ee/lib/g... and
| https://gitlab.com/gitlab-
| org/gitlab/-/blob/v16.8.1-ee/lib/g...
| amiga386 wrote:
| From TFA:
|
| > In Perl, eval can be used with a block to trap exceptions
| which is why it was being used everywhere.
|
| ...which your search doesn't seem to exclude (I'm not signing
| up to Github to find out so I'm just guessing from your URL
| that you didn't exclude "eval {" or "eval q{")
|
| It does still use eval on occasion and they seem happy with it.
|
| * for their own controlled expressions to customise base
| parsing behaviour (which could be refactored to use function
| references instead, but it's not a case of evaluating external
| data)
|
| * to test if modules can be imported (which you've excluded
| from your search)
|
| and in one case in CanonRaw.pm, using eval on an expression
| that matches [0-9./]+ , I'm not sure it looks at external data
| but it might, and at best you could cause a divide-by-zero
| error (e.g. 1/0) or syntax error (e.g. ///) if you mess with
| the data it parses.
|
| So overall, not much dangerous eval going on.
| fulafel wrote:
| Catching exceptions from eval-ing untrusted code doesn't
| strike me as being a big help from security POV, or is there
| something else special about the block form of eval that
| helps here?
| _flux wrote:
| Block form is not intended to run untrusted code, it is
| used similar to how exception handling is used in some
| languages: perl -e 'eval { die "aiee" };
| print "problem: $@"'
| mdaniel wrote:
| The non-GH search version: https://sourcegraph.com/search?q=c
| ontext:global+repo:%5Egith...
|
| Your observation is proper in that it did not exclude the
| patterns you mentioned, but in this specific case not
| necessary because all 4 results are shaped the exact same
| way: lib/Image/ExifTool/Exif.pm
| #### eval Start ($valuePtr, $val) my
| $newStart = eval($$subdir{Start});
| unless (Image::ExifTool::IsInt($newStart)) {
| #### eval Base ($start,$base)
| $subdirBase = eval($$subdir{Base}) + $base;
| } lib/Image/ExifTool/MakerNotes.pm
| #### eval Start ($valuePtr) $newStart =
| eval($$subdir{Start}); }
| #### eval Base ($start,$base) my
| $baseShift = eval($$subdir{Base}); #
| shift directory base (note: we may do this again below
| #### eval OffsetPt ($valuePtr)
| $ifdOffsetPos = eval($$subdir{OffsetPt}) - $dirStart;
| }
| ptx wrote:
| You're right that excluding "eval {" is not necessary
| because it only occurs in the already excluded subset, but
| that search link is missing a lot of matches since the
| space character in my pattern has turned into a plus sign
| in yours. Try this: https://sourcegraph.com/search?q=contex
| t%3Aglobal+repo%3A%5E...
|
| There are also some cases (excluded in this search) where a
| charset parameter is passed around and eventually passed to
| eval in the LoadCharset function, e.g. from the RTF parser
| through the Recompose method. Not sure if that's always
| safe.
| Jaxan wrote:
| Exiftool is one of those open source tools which provides a lot
| of features just for free. It's really amazing that people can
| work on something like this for so long.
| Levitating wrote:
| The "Investigation" box at Hack The Box includes this
| vulnerability.
| 1vuio0pswjnm7 wrote:
| I prefer Exiv2 over ExifTool. I can compile it statically. Does
| not require a Perl installation.
| lucb1e wrote:
| I would be more interested in a "what I like about it" sentence
| over a preemptively pacifying "to each their own" tbh. The
| tool's own description doesn't compare against exiftool either,
| besides that it sounds like it does only images and not, e.g.,
| documents
|
| > Exiv2 is a C++ library and a command-line utility to read,
| write, delete and modify Exif, IPTC, XMP and ICC image
| metadata.
___________________________________________________________________
(page generated 2024-01-27 23:00 UTC)