[HN Gopher] Adventures in Fuzzing Matrix's Encryption
___________________________________________________________________
Adventures in Fuzzing Matrix's Encryption
Author : Arathorn
Score : 64 points
Date : 2021-06-14 18:00 UTC (5 hours ago)
(HTM) web link (matrix.org)
(TXT) w3m dump (matrix.org)
| sohei wrote:
| Is there a Matrix design document at a similar level of
| abstraction to [1,2,3]?
|
| I want to use it, but the client (Riot? Element?) scares me with
| all its features, and I'm not sure I have a working mental model
| about what information is shared with whom and under what
| circumstances.
|
| Is there a "blessed" client that doesn't run in the browser?
| Written in Python or Java?
|
| [1] https://www.chromium.org/chromium-os/chromiumos-design-
| docs/... [2] https://www.chromium.org/chromium-os/chromiumos-
| design-docs/... [3] https://www.chromium.org/chromium-
| os/chromiumos-design-docs/...
| ognarb wrote:
| There is a spec[1] as well as many third parties clients[2]
|
| [1]: https://spec.matrix.org/unstable/
|
| [2]: https://matrix.org/clients/
| feanaro wrote:
| For a nicely documented and clean Matrix client
| implementation in Python, see weechat-matrix[1] and the
| matrix-nio[2] library it's based on. There's also Mirage[3]
| which is also based on matrix-nio and is a GUI client.
|
| [1]: https://github.com/poljar/weechat-matrix
|
| [2]: https://github.com/poljar/matrix-nio
|
| [3]: https://github.com/mirukana/mirage
| sharedfrog wrote:
| > Classic fuzzers famously have a hard time circumventing magic
| values and checksums, and cryptography is full of these. This is
| further complicated by the fact that the double ratchet algorithm
| is very stateful and depends on the two ratchets evolving in
| lockstep.
|
| Can the first issue be solved by scraping all the magic values
| from the codebase and putting them in the fuzzer's dictionary
| file? I wonder if this could even be automated when doing white-
| box fuzzing -- have the fuzzer scan the code when placing
| instrumentation and extract every "interesting" constant from
| e.g. `if` checks.
|
| Regarding the second issue and the double ratchet specifically,
| are there even ideas on how it could be approached for feasible
| fuzzing?
| infogulch wrote:
| Perhaps the solution is to extend the idea of the fuzzer
| dictionary to include not only byte string constants but also
| _functions_. Throw in magic values, yes, but also sha-3 and the
| main primitive for the ratchet algorithm, etc.
|
| That way the fuzzer doesn't have to reinvent hashing and
| encryption algorithms from scratch, which I expect would be as
| impossible as generating the preimage of a hash or outright
| breaking the encryption. Maybe those should be fuzzed too, but
| let's not swallow the whole cow at once when we just want to
| fuzz one protocol.
| TonyTrapp wrote:
| AFL++ already does that (it scans occurrences of strcmp,
| memcmp, etc. for magic values). The problem here might rather
| be that those magic constants are not expected to show up in
| the bit stream being read, but rather being the result of a
| calculation. If the result of a calculation must be 42 for a
| function to continue its execution, it's not so useful to have
| 42 in the fuzzer dictionary. If 42 is expected to be read from
| the file instead, it's easier and a dictionary will help.
| dkasak wrote:
| Author here. As one of the other comments mentions, afl++ (and
| to some extent vanilla afl) already has capability to
| automatically scrape magic values from arguments to special
| functions like `strcmp` and the like. The older technique is
| called libtokencap (https://github.com/AFLplusplus/AFLplusplus/
| blob/stable/utils...), but afl++ also has a newer feature
| called AUTODICT (https://github.com/AFLplusplus/AFLplusplus/blo
| b/stable/instr...).
|
| But this only solves the problem of magic _constants_ expected
| in the input. If the check depends on dynamic properties of the
| input or happens deeper in the code after the input 's already
| been through some transformations, it can't be solved like
| this. There are other techniques to help with this, though. One
| of the earlier attempts to solve such types of more complex
| checks is called laf-intel (https://github.com/AFLplusplus/AFLp
| lusplus/blob/stable/instr...) and boils down to transforming a
| more complex check into a nested series of simpler checks. This
| makes it more probable that the fuzzer's random mutation will
| be able to solve the outer check and hence hit new coverage,
| enabling the fuzzer to detect the mutation as productive.
|
| afl++ has a more modern variant of this called CmpLog (https://
| github.com/AFLplusplus/AFLplusplus/blob/stable/instr...) which
| is based on the RedQueen technique. The paper for RedQueen is a
| really interesting read: https://www.syssec.ruhr-uni-
| bochum.de/media/emma/veroeffentl...
|
| The problem of checksums is at times also solved by simply
| modifying the binary so that the checksum is neutralized and
| always succeeds, especially if you have access to source code.
|
| As for the problem of fuzzing stateful things like the double
| ratchet, one way of tackling the problem is to think of the
| input to the fuzzer as not only the raw bytes that you'll be
| passing to the program you're fuzzing, but as a blueprint
| specifying which high-level operations you'll be performing on
| the input. Then you teach your fuzzer to be smarter and be able
| to perform a bunch of those operations.
|
| So, let's say you take 512 bytes as the input to the fuzzer.
| You treat the first 256 bytes as the message to decode and the
| latter 256 bytes as the high-level cryptographic operations to
| perform on this message, each byte specifying one of those
| operations. So you could say a byte of value 1 represents the
| operation "ENCRYPT WITH KEY K1", 2 represents "ENCRYPT WITH KEY
| K2", 3 represents "DECRYPT WITH KEY K1", 4 represents "DECRYPT
| WITH KEY K2", 5 represents "PERFORM SHA2" and so on. Now you
| can feasibly end up with a sequence which will take a message
| encrypted with key K1, decrypt it, modify the message, then re-
| encrypt with key K2. Or, in the case of the double ratchet
| algorithm, have it perform multiple successive encryption steps
| to evolve the state of the ratchet and be able to fuzz more
| deeply.
|
| Of course, the encoding needs to be rather dense for this to
| work well so that ideally each low-level bit mutation the
| fuzzer does on an input still encodes a valid sequence of valid
| high-level operations.
___________________________________________________________________
(page generated 2021-06-14 23:00 UTC)