[HN Gopher] Zoom: Remote Code Execution with XMPP Stanza Smuggling
       ___________________________________________________________________
        
       Zoom: Remote Code Execution with XMPP Stanza Smuggling
        
       Author : Flowdalic
       Score  : 173 points
       Date   : 2022-05-24 15:00 UTC (8 hours ago)
        
 (HTM) web link (bugs.chromium.org)
 (TXT) w3m dump (bugs.chromium.org)
        
       | thinkmassive wrote:
       | Heh, it's like an AIM punter, but better!
        
       | jeffbee wrote:
       | At some point we are going to need enforceable professional
       | standards that effectively deal with commercial software
       | publishers who choose to parse untrusted inputs in non-
       | performance-sensitive contexts with C libraries.
        
         | TedDoesntTalk wrote:
         | We are? Why?
        
           | defen wrote:
           | Since most software users are not tech-savvy and care about
           | convenience and price significantly more than they care about
           | security (revealed preference), the "worse is better"
           | phenomenon incentivizes commercial developers to implement
           | the minimum security practices that their customers will
           | bear. This is individually rational for the developers and
           | the users, but the result is untold billions of dollars of
           | costs. Regulation would be one way to change the
           | incentives.
        
         | turminal wrote:
         | This bug has nothing to do with language choice.
         | 
         | I agree that better professional standards and accountability
         | should be introduced for software like zoom though.
        
         | userbinator wrote:
         | No. We don't need more authoritarian dystopia.
        
       | bobbylarrybobby wrote:
       | Having multiple, potentially different parsers is incredibly
       | dangerous. One person used the fact that different plist parsers
       | in the macOS kernel choked in different ways when interpreting
       | malformed xml, leading some to believe the plist was "safe"
       | because it did not grant certain permissions, while others
       | trusted this "safe" plist but believed it did grant these
       | permissions.
       | 
       | https://blog.siguza.net/psychicpaper/
        
       | dqv wrote:
       | I didn't even consider the existence of XMPP vulns until I
       | listened to the Darknet Diaries episode about Kik[0]. It's a
       | really interesting class of vulnerabilities.
       | 
       | [0]: https://darknetdiaries.com/episode/93/
        
       | dgellow wrote:
       | Some relevant info in case you don't want to read the whole
       | description but wonder if you're concerned by the issue:
       | 
       | > Zoom fixed the server-side issues in February and client-side
       | issues on April 24 in version 5.10.4.
       | 
       | > Zoom published a security bulletin about client-side fixes at
       | https://explore.zoom.us/en/trust/security/security-bulletin
       | 
       | CVE-2022-25235 CVE-2022-25236 Fixed-2022-Apr-24 CVE-2022-22784
       | CVE-2022-22785 CVE-2022-22786 CVE-2022-22787
        
       | kevincox wrote:
       | This is another lesson that you should always parse+serialize
       | rather than just validate. It is much harder to smuggle data this
       | way to exploit different parsers.
       | 
       | Basically the set of all messages that will satisfy your
       | validator is far larger than the set of all messages that will be
       | produced by your serializer.
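
A minimal sketch of the parse+serialize idea, using Python's standard
xml.etree.ElementTree rather than anything Zoom-specific. The helper
name `reserialize` is hypothetical. Two different spellings of the same
stanza collapse to one canonical output, so a downstream parser only
ever sees what the serializer can produce. As the reply below points
out, this does not help if the parser itself accepts malformed input
and carries it into its internal representation.

```python
import xml.etree.ElementTree as ET


def reserialize(untrusted: str) -> str:
    """Parse untrusted XML, then emit it again from the parsed tree.

    Raises ET.ParseError if the input is not well-formed. This sketch is
    not hardened against entity-expansion attacks.
    """
    root = ET.fromstring(untrusted)
    return ET.tostring(root, encoding="unicode")


# Odd spacing, single quotes, and CDATA are all legal on input...
a = reserialize("<message  to='bob' ><body>hi</body></message>")
b = reserialize('<message to="bob"><body><![CDATA[hi]]></body></message>')

# ...but the serializer emits one canonical spelling for both.
print(a)
print(a == b)
```

Forwarding `a` instead of the original bytes means the set of messages
that can reach the next hop is limited to what the serializer emits.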
        
         | lovasoa wrote:
         | I am not sure this applies in this case. I don't know how
         | Zoom's XMPP backend works, but it could very well parse and
         | serialize and still be vulnerable. If the xml library accepts
         | invalid 3-byte utf8 characters on parse, then its internal
         | representation supports these characters, and I don't see why
         | they would not be serialized just as well.
        
         | fsflover wrote:
         | Or, it's another lesson that you should not completely trust
         | any code but compartmentalize instead. Thanks to Qubes OS, I am
         | still safe, since Zoom is running in a hardware-virtualized VM.
        
           | JoshTriplett wrote:
           | I'm safe as well, because I only use the web version of Zoom.
           | Code you don't trust should always run in a sandbox, if it
           | runs at all.
        
             | fsflover wrote:
             | This is however a very different level of sandboxing.
        
               | JoshTriplett wrote:
               | Sure, but it's much easier for most people to run things
               | in a browser sandbox.
        
           | jeffbee wrote:
           | How is that helpful? This exploit completely replaces the
           | Zoom software with arbitrary attacker software and it
           | executes in your VM that has access to camera, microphone,
           | network, and presumably screen recording. It sounds to me
           | like the highest possible level of access and your VM is just
           | performative.
        
             | fsflover wrote:
             | 1. It will not have access to anything other than Zoom.
             | 
             | 2. It will not have access to the camera or network, when
             | I'm not using Zoom.
             | 
             | 3. If I'm using a disposable VM, it's cleaned every reboot.
             | 
             | > and presumably screen recording
             | 
             | Screen recording of this VM.
        
               | jeffbee wrote:
               | How is screen recording only of Zoom itself of any use to
               | you?
        
               | fsflover wrote:
               | If needed, I can move a presentation to that VM, or open
               | a browser in it.
               | 
               | It gets a bit complicated if you want to share a screen
               | from another VM, see https://forum.qubes-os.org/t/share-
               | screen-of-qube-with-anoth...
        
         | ifratric1 wrote:
         | XMPP servers (including Zoom's) already parse + serialize ;)
        
       | robertlagrant wrote:
       | This vuln writeup is extremely well written. Actually quite
       | interesting to read!
        
       | twoodfin wrote:
       | The XML parsing/validation bugs are, I suppose, not shocking, but
       | deeply disappointing.
       | 
       | The _one thing_ XML & its tooling were supposed to get right was
       | document well-formed-ness. Sure, it might be a mess of a standard
       | in other ways, but at least we could agree what a parser should
       | and shouldn't accept! (Not the case for the HTML tag soup of then
       | or now.)
       | 
       | That, 25 years on, a popular XML processor can't even meet that
       | low bar for _tag names_ is maddening.
        
         | jerf wrote:
         | Unfortunately, the problem here is programmers moreso than
         | formats. It literally doesn't matter what you specify,
         | programmers will not implement it to a T. Most programmers
         | simply don't know that every single detail matters. Many of
         | those who may have some idea don't really care, since they
         | can't imagine how something like this could happen.
         | 
         | It's not just XML. It's every ecosystem I've ever used. Push it
         | around the edges and you _will_ find things.
         | 
         | This is neat, not because it is special to JSON in particular
         | but because it's an example of examining a good chunk of a
         | large ecosystem: https://seriot.ch/projects/parsing_json.html
         | Consider this is likely to be true in any ecosystem that
         | doesn't make it a top priority to avoid.
        
           | mwcampbell wrote:
           | I suppose it's safest to use a binary format where variable-
           | length fields are prefixed with their length.
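
A sketch of length-prefixed framing under stated assumptions: a 4-byte
big-endian length followed by the payload, with a hypothetical cap
`MAX_FRAME`. The payload is never inspected to find where it ends, and
the declared length is checked against both the cap and the bytes
actually available, which are the two failure modes raised in the
replies below.

```python
import struct

MAX_FRAME = 64 * 1024  # assumed cap; pick one that fits the protocol


def read_frame(buf: bytes, offset: int = 0) -> tuple[bytes, int]:
    """Return (payload, next_offset) for one length-prefixed frame."""
    if len(buf) - offset < 4:
        raise ValueError("truncated length prefix")
    (length,) = struct.unpack_from(">I", buf, offset)
    if length > MAX_FRAME:
        raise ValueError("declared length exceeds cap")
    start = offset + 4
    end = start + length
    if end > len(buf):
        raise ValueError("declared length exceeds available data")
    return buf[start:end], end


# The payload may contain anything, including '<', without confusing
# the framing layer.
frame = struct.pack(">I", 5) + b"<a/>!"
payload, next_offset = read_frame(frame)
print(payload, next_offset)
```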
        
             | amluto wrote:
             | More generally, if you want to include a block of
             | untrustworthy structured data in a protocol, it's very much
             | preferable to do so in a way that does not require
             | inspecting the data in question to figure out where it ends
             | and thus where the outer protocol resumes.
             | 
             | English is not immune. Think about "who's on first" --
             | there is no way to distinguish the untrustworthy name "who"
             | from a grammatical part of the conversation.
        
             | jandrese wrote:
             | Sure if you like ingesting 4GB records. There is nothing
             | inherently safer in binary formats. It's easy to write
             | parsers that can handle properly formatted files, it is
             | when you're dealing with corrupt or misformed files that
             | everything gets complicated.
        
               | teakettle42 wrote:
               | > There is nothing inherently safer in binary formats.
               | 
               | Sure there is. Barring a pathologically bad wire format
               | design, they're easier to parse than an equivalent human
               | editable encoding.
               | 
               | Eliminating the human-editing ability requirement also
               | enables us to:
               | 
               | - Avoid introducing character encoding -- a huge problem
               | space just on its own -- into the list of things that all
               | parsers must get right.
               | 
               | - Define non-malleable encodings; in other words, ensure
               | that there exists only one valid encoding for any valid
               | message, eliminating parser bugs that emerge around
               | handling (or not) multiple different ways to encode the
               | same thing.
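
To illustrate the non-malleable encoding point, here is a sketch of an
unsigned LEB128-style varint decoder that rejects overlong encodings,
so each integer has exactly one accepted byte sequence. The function
name and the 10-byte limit are assumptions for illustration, not taken
from any particular wire format.

```python
MAX_VARINT_BYTES = 10  # assumed limit, enough for 64-bit values


def decode_uvarint_canonical(buf: bytes) -> tuple[int, int]:
    """Decode one unsigned varint; return (value, bytes_consumed).

    Rejects truncated input and non-minimal (zero-padded) encodings.
    """
    value = 0
    shift = 0
    for i, byte in enumerate(buf):
        if i >= MAX_VARINT_BYTES:
            raise ValueError("varint too long")
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            # A multi-byte varint whose final byte is zero carries no
            # information in that byte, so a shorter encoding exists.
            if i > 0 and byte == 0:
                raise ValueError("non-minimal encoding")
            return value, i + 1
        shift += 7
    raise ValueError("truncated varint")


print(decode_uvarint_canonical(b"\xac\x02"))
```

With this check, `b"\x01"` is the only accepted spelling of 1, while
`b"\x81\x00"` is rejected even though a lenient decoder would also read
it as 1.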
        
             | jerf wrote:
             | Assuming properly-created data, yes. You aren't immune to
             | problems but you will reduce them, especially in a memory-
             | safe language.
             | 
             | Unfortunately, in a security context, that is not only not
             | guaranteed, but will be actively attacked, so in practice
             | I'm not sure it buys you _that_ much from a security
             | perspective. A net positive, I think, but certainly not
             | enough that you can metaphorically kick back and enjoy your
             | lemonade.
             | 
             | The binary format is one of the oldest of security
             | vulnerabilities, by simply claiming a length of larger than
             | the buffer allocated in the C program, though I'm inclined
             | to credit that particular joy to C and not the data itself.
             | Nowadays there aren't many languages where simply claiming
             | to be really long will get you anywhere like that.
        
             | ajsnigrutin wrote:
             | Sure, until someone sets the prefix to 100MB large, and
             | sends zero bytes of data :)
        
           | IshKebab wrote:
           | I disagree. The way the format is designed has a direct
           | effect on how likely implementors are to implement it
           | correctly. So the format designers bear some responsibility.
           | 
           | For example how many Protobuf parser libraries have security
           | bugs? I'm guessing very few because the standard is nice and
           | simple, and it's very clearly defined without much "it's
           | probably like this" wiggle room (much easier for binary
           | formats!).
           | 
           | XML had a ton of unnecessary complexity that could have been
           | avoided to make implementations simpler. I haven't actually
           | read this bug so let's see if it was one of:
           | 
           | * Closing tags having to repeat the name / two different ways
           | of closing tags.
           | 
           | * CDATA
           | 
           | * Namespaces (especially how they are defined)
           | 
           | * &entities;
           | 
           | Edit: Ha it wasn't any of those - but it was still an issue
           | with text based formats. Seems like Expat assumes the content
           | is _valid_ UTF-8 (and doesn't validate it), while Gloox
           | assumes it is ASCII. Obviously this couldn't have happened
           | with binary formats.
           | 
           | If you care about security DON'T USE TEXT FORMATS!
        
             | salawat wrote:
             | Wrong.
             | 
             | If you care about security, _verify your goddamn
             | invariants_.
             | 
             | This is not a software problem. This is a lazy
             | programmer/software engineer problem. Electrical
             | Engineering, or hell, any mature engineering field
             | understands this concept.
             | 
             | If you have not read your entire codepath, _you have no
             | idea what it is you are doing_.
             | 
             | Welcome to why my life as a QA is effing miserable. Every
             | bit of ignorance by devs following the philosophy of
             | "abstraction is good" is dealt with at the level of
             | Software BoM audit.
             | 
             | All hail Time to Market!
        
               | KronisLV wrote:
               | > If you care about security, verify your goddamn
               | invariants.
               | 
               | While it would be nice to be able to do this, sadly we
               | don't have infinite resources, lest we be okay with
               | actually shipping software in 5-10 years instead of 1-2.
               | I know that I would be okay with such a world, but people
               | who pay my salary might not share that point of view. Nor
               | do the people who would have to choose an app to use in
               | the near future, instead of waiting for a decade to do
               | so.
               | 
               | > This is not a software problem. This is a lazy
               | programmer/software engineer problem. Electrical
               | Engineering, or hell, any mature engineering field
               | understands this concept.
               | 
               | The thing is, that the majority of the development out
               | there is like the Wild West. If my code throws a
               | NullPointerException or a NullReferenceException, then
               | someone is going to be mildly annoyed and it might result
               | in a Jira issue to fix. Code failing in a variety of ways
               | is almost considered normal in some respects, outside of
               | specific (expensive) contexts, where correctness matters
               | a lot.
               | 
               | Admittedly, even in programming there are fields where
               | the stakes are higher, though writing code for planes (as
               | an example) is wildly different than what 90% of people
               | out there would call "programming". Personally, I'd like
               | 100% test coverage (lines, code branches, everything),
               | but outside of these high stakes environments it would be
               | wasteful to do so.
               | 
               | > If you have not read your entire codepath, you have no
               | idea what it is you are doing.
               | 
               | For many out there, this is pretty much impossible to do
               | in a meaningful way. Let's use something like the Spring
               | framework, a popular option in Java for web dev, a stack
               | that has a rather high level of abstraction. In it, the
               | actual code path that you're dealing with would involve
               | your application code, the framework code (which is
               | likely many times longer than your actual application,
               | uses reflection and other complex mechanisms, overall
               | being truly Eldritch at times), any integrated libraries,
               | as well as the JVM and some other code on your actual
               | system, that interfaces with the JVM.
               | 
               | Even if you toss out Java from the stack, the actual hot
               | code path in any non-trivial piece of software will be
               | pretty difficult to reason about, due to different types
               | of linking, different external package versions etc.
               | Unless you feel okay with very, very slowly stepping
               | through everything with a debugger, which probably still
               | won't give you too good of an idea of what's actually
               | happening and what should have happened.
               | 
               | Though maybe traversing 20 layers of abstraction in
               | Spring and coming out of that debugging session more
               | confused than you were when you entered it is just a
               | Java/Spring thing, who knows.
               | 
               | > Welcome to why my life as a QA is effing miserable.
               | Every bit of ignorance by devs following the philosophy
               | of "abstraction is good" is dealt with at the level of
               | Software BoM audit.
               | 
               | I think there's plenty of misery to be had all around.
               | For a humorous take at the state of things, have a look
               | at this article:
               | https://www.stilldrinking.org/programming-sucks
               | 
               | > All hail Time to Market!
               | 
               | All hail being able to pay rent by delivering sub-optimal
               | software to meet ever changing business demands in an
               | environment where nobody wants to pay for perfect
               | software. That's simply the world we live in, take it or
               | leave it (e.g. pursue whichever environment feels better
               | to you, within the bounds of your opportunities in life).
        
           | twoodfin wrote:
           | This is just so basic a screwup though. The W3C spec for XML
           | has had a formal syntactic description of valid tag names for
           | decades:
           | 
           | https://www.w3.org/TR/2006/REC-xml11-20060816/#sec-common-
           | sy...
           | 
           | Plenty of libraries get this right because it's so easy.
           | You'd almost have to try--probably by being "clever"--to get
           | it wrong.
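
For a sense of how small that grammar is, here is a sketch of the Name
production as a Python regular expression. The ranges are transcribed
from the `NameStartChar` and `NameChar` productions of the XML spec and
should be checked against it before being relied on. The helper name
`is_xml_name` is hypothetical.

```python
import re

# NameStartChar: ":" | [A-Z] | "_" | [a-z] | a fixed set of ranges.
_NAME_START = (
    ":A-Z_a-z"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
# NameChar: NameStartChar | "-" | "." | [0-9] | #xB7 | two more ranges.
_NAME_CHAR = _NAME_START + r"\-.0-9" + "\u00B7\u0300-\u036F\u203F-\u2040"

_XML_NAME = re.compile(f"[{_NAME_START}][{_NAME_CHAR}]*")


def is_xml_name(candidate: str) -> bool:
    """True if the whole string matches the Name production."""
    return _XML_NAME.fullmatch(candidate) is not None


print(is_xml_name("stream:stream"), is_xml_name("a<b"))
```

Note that this operates on already-decoded characters. It says nothing
about whether the underlying bytes were valid in the declared encoding,
which is a separate check.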
        
         | Diggsey wrote:
         | There are just so many issues here.
         | 
         | 1) Don't rely on two parsers having identical behaviour for
         | security. Yes parsers for the same format _should_ behave the
         | same, but bugs happen, so don't design a system where small
         | differences result in such a catastrophic bug. If you
         | absolutely _have_ to do this, at least use the same parser on
         | both ends.
         | 
         | 2) Don't allow layering violations. All content of XML
         | documents is required to be valid in the configured character
         | encoding. That means layer 1 of your decoder should be
         | converting a byte stream into a character stream, and layers 2+
         | should not even have the opportunity to mess up decoding a
         | character. Efficiency is not a justification, because you can
         | use compile-time techniques to generate the exact same code as
         | if you combined all layers into one. This has the added benefit
         | that it removes edge-cases (if there is one place where bytes
         | are decoded into characters, then you _can't_ get a bug where
         | that decoding is only broken in tag names, and so your test
         | coverage is automatically better).
         | 
         | 3) Don't transparently download and install stuff without user
         | interaction, regardless of where it comes from!
         | 
         | 4) Revoke certificates for old compromised versions of an
         | installer so that downgrade attacks are not possible.
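
Points 1 and 2 can be illustrated with a toy example. This is not the
actual Gloox or Expat code. The two scanners are hypothetical stand-ins
for parsers that disagree about malformed UTF-8, and `decode_layer`
stands in for a strict "layer 1" that refuses the input before any tag
scanning happens.

```python
def decode_layer(raw: bytes) -> str:
    """Layer 1: bytes to characters, rejecting malformed UTF-8."""
    return raw.decode("utf-8", errors="strict")


def ascii_scan(raw: bytes) -> list[int]:
    """Toy scanner A: treats every 0x3C byte as the start of a tag."""
    return [i for i, b in enumerate(raw) if b == 0x3C]


def lead_byte_skip_scan(raw: bytes) -> list[int]:
    """Toy scanner B: trusts a UTF-8 lead byte and skips the bytes it
    announces without checking that they are continuation bytes."""
    offsets = []
    i = 0
    while i < len(raw):
        b = raw[i]
        if b == 0x3C:
            offsets.append(i)
        if b >= 0xF0:
            i += 4
        elif b >= 0xE0:
            i += 3
        elif b >= 0xC0:
            i += 2
        else:
            i += 1
    return offsets


# 0xEB announces a 3-byte sequence, but is followed by plain ASCII.
raw = b"<a>\xeb<b></a>"
print(ascii_scan(raw))           # sees a tag start hidden after 0xEB
print(lead_byte_skip_scan(raw))  # skips over that tag start
```

The two scanners disagree about how many tags the same bytes contain,
which is the kind of differential that smuggling relies on. Calling
`decode_layer(raw)` raises `UnicodeDecodeError`, so neither scanner
would ever be handed this input.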
        
           | iancarroll wrote:
           | > Revoke certificates for old compromised versions of an
           | installer so that downgrade attacks are not possible.
           | 
           | Worth noting that Windows accepts signatures from revoked
           | code signing certificates so long as the signature has a
           | signed timestamp from before the revocation.
        
             | hamandcheese wrote:
             | ....and I assume the revocation can't be back-dated?
        
               | ComputerGuru wrote:
               | timestamps must come from a globally recognized signed
               | source, like digicert or verisign.
        
               | iancarroll wrote:
               | The CA could backdate the CRL's revocation timestamp if
               | they wanted, but it seems unlikely and presumably it's
               | not allowed.
        
           | bombcar wrote:
           | I doubt anyone actively revokes certificates ever - perhaps
           | maybe the game console makers.
        
             | crismigo wrote:
             | dsdas
        
       | henearkr wrote:
       | Good thing that I never used the standalone client and always the
       | in-browser webapp instead.
        
         | user23894295637 wrote:
         | How do you do that? On any OS I tried (Debian, Windows) it
         | always *forces* me to download the standalone client, otherwise
         | I can't join. There's no alternative link ("Join via web") like
         | MS Teams has for example.
         | 
         | I really feel uncomfortable each time I have to install the
         | client on a machine for my relatives :/
        
           | ydant wrote:
           | I've always been able to use the in-browser client, but you
           | have to download the client once or twice before the page
           | will update to show the alternative "use browser". It's
           | definitely an intentional dark pattern.
        
           | mehagar wrote:
           | Check out https://github.com/arkadiyt/zoom-redirector. You
           | can also join meetings from https://pwa.zoom.us/wc/.
        
             | user23894295637 wrote:
             | OMG, thank you so much! That's a huge relief.
             | 
             | I actually started boycotting Zoom meetings where I can. If
             | anyone sends me a zoom invitation and I know that they are
             | not forced by having to be available for larger audiences I
             | suggest them to use basically anything else.
             | 
             | I don't know why, but from the first time I visited their
             | website until today, I have the feeling I can't trust the
             | company.
        
       | Flowdalic wrote:
       | It appears that Gloox, a relatively low-level XMPP-client C
       | library, rolled much of its Unicode and XML parsing itself, which
       | made such vulnerabilities more likely. There may be good reasons
       | to not re-use existing modules and rely on external libraries,
       | especially if you target constrained low-end embedded devices, but
       | you should always be aware of the drawbacks. And the Zoom client
       | typically does not run on those.
        
         | Aeolun wrote:
         | I find that response a bit strange, since the whole reason the
         | Zoom client has these particular vulnerabilities is because
         | they didn't roll their own, and instead rely on layers of
         | broken libraries.
         | 
         | It's quite possible they'd have more bugs without doing that,
         | but re-using existing modules could just as easily have been an
         | even worse idea.
        
           | eli wrote:
           | I think the point is that Unicode and XML parsing are known
           | to be security critical components and you should take care
           | that they are handled only by well tested code designed
           | specifically for the purpose. You need to not roll your own
           | and also ensure that any third party components didn't roll
           | their own.
        
             | remus wrote:
             | > You need to not roll your own and also ensure that any
             | third party components didn't roll their own.
             | 
             | If you're not writing the code and somebody else isn't
             | writing the code then who is writing the code?!
        
               | eli wrote:
               | A well-tested Unicode library built for security should
               | be doing your Unicode parsing in security critical
               | components.
               | 
               | It's just another way of saying you should be doing a
               | security audit as part of selecting a library and
               | integrating it into your product.
        
           | WesolyKubeczek wrote:
           | Using what everyone and their dog is using is prone to bugs
           | just as much because software without bugs doesn't exist or
           | is not very useful, but it also has the benefit of many
           | versatile eyeballs looking at it in many different contexts.
           | 
           | So if there's a bug found and fixed in libxml2 which is used
           | by almost everything else, everyone else instantly benefits.
           | Same with libicu which is being used, for example, by NodeJS
           | with its huge deployments footprint. Oh, and every freakin'
           | Webkit-based browser out there.
           | 
           | OTOH, they rolled their own, so all bugs they hit are
           | confined only to zoom, and are only guaranteed to get Zoom
           | all the bad press.
           | 
           | Choose your poison carefully.
        
             | Aeolun wrote:
             | If they roll their own it also becomes less interesting to
             | actively exploit.
             | 
             | Obviously this doesn't really work for Zoom any more, since
             | their footprint is too large, but it can stop driveby
             | attackers in other situations. Nobody is going to expend
             | too much effort figuring out joe schmuck's homegrown
             | solution, where they'd happily run a known exploit against
             | the unpatched wordpress server.
        
               | pixl97 wrote:
               | Security by obscurity has been debated to hell and back.
               | It only works if you stay obscure... and don't leak your
               | code.
        
           | Flowdalic wrote:
           | I get your confusion. But keep in mind that it is not only
           | about just picking the library that shows as first result of
           | your Google search. My naive self thinks that a million
           | dollar company should do some research and evaluate different
           | options when choosing external codebase to build their
           | flagship product on. There are dozens of XMPP libraries, and
           | they picked the one that does not seem to delegate XML and
           | Unicode handling to other libraries, which should raise a
           | flag.
        
           | mwcampbell wrote:
           | I think that's a false dichotomy; IMO the best default choice
           | is to rely on the most well-tested library in any given
           | category. That suggests to me that they should have used
           | expat on the client side.
        
         | zamalek wrote:
         | One of the harder things with XMPP is that it is a badly-formed
         | document up until the connection is closed. You need a SAX-
         | style/event-based parser to handle it. That makes rolling your
         | own understandable in _some_ cases (e.g. dotnet's System.Xml
         | couldn't do this prior to XLinq).
         | 
         | That being said, as you indicated Gloox is C-based, and the
         | reference implementation of SAX is in C. There is no excuse.
        
           | Flowdalic wrote:
           | > One of the harder things with XMPP is that it is a badly-
           | formed document up until the connection is closed. You need a
           | SAX-style/event-based parser to handle it.
           | 
           | That is a common misconception, although I am not sure of its
           | origin. I know plenty of XMPP implementations that use an XML
           | pull parser.
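
As a sketch of the pull-parser approach, Python's standard
`xml.etree.ElementTree.XMLPullParser` can be fed chunks as they arrive
and polled for events, without the outer stream element ever being
closed. The function name and the simplified element names are
assumptions for illustration. Real XMPP streams use namespaced elements
and stream-level negotiation that this ignores.

```python
import xml.etree.ElementTree as ET


def completed_stanzas(chunks):
    """Feed chunks incrementally; return (tag, body text) for each
    direct child of the stream root once its end tag has been seen."""
    parser = ET.XMLPullParser(events=("start", "end"))
    depth = 0
    done = []
    for chunk in chunks:
        parser.feed(chunk)  # does not block waiting for more input
        # Newer Python versions expose flush() to force pending data
        # through the underlying parser; guard for older versions.
        if hasattr(parser, "flush"):
            parser.flush()
        for event, elem in parser.read_events():
            if event == "start":
                depth += 1
            else:
                depth -= 1
                if depth == 1:  # a top-level stanza just closed
                    done.append((elem.tag, elem.findtext("body")))
    return done


# The <stream> element is opened but never closed.
chunks = ["<stream>", "<message><body>hi</body>", "</message>", "<presence/>"]
print(completed_stanzas(chunks))
```

Because `feed` returns immediately, the same loop can be driven from an
event loop with one parser object per connection rather than one
blocked thread per connection.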
        
             | zamalek wrote:
             | It's possible by blocking the thread that's reading the
             | XML, but now you're in thread-per-client territory, and
             | that doesn't scale.
        
           | TedDoesntTalk wrote:
           | DOM-based XML parsers use SAX parsing under the hood.
        
             | zamalek wrote:
             | Right, but if they don't give you access to the SAX parser
             | then you are SOL.
        
         | xxpor wrote:
         | This is a very common issue across all of software engineering
         | I've found. But I really don't get why. If I was given the task
         | of parsing Unicode or XML, I'd run and find a library as fast
         | as possible, because that sounds terrible and tedious, and I'd
         | rather do literally anything else!
         | 
         | Why aren't people more lazy, in other words?
        
       | rektide wrote:
       | How much of Zoom is powered by XMPP? Do we know much about these
       | internals? This would be super cool to learn about.
        
       ___________________________________________________________________
       (page generated 2022-05-24 23:00 UTC)