[HN Gopher] The Ken Thompson Hack
___________________________________________________________________
The Ken Thompson Hack
Author : stephc_int13
Score : 108 points
Date : 2024-03-30 12:41 UTC (10 hours ago)
(HTM) web link (wiki.c2.com)
(TXT) w3m dump (wiki.c2.com)
| legobmw99 wrote:
| Surprisingly, it took until last year for someone to actually ask
| Ken for the code: https://research.swtch.com/nih
| rolandog wrote:
| Thanks for the link. I had saved it for later reading, but
| didn't get around to doing it until now.
|
| It's such a well-written article.
|
| Here's previous HN discussion about it:
|
| https://news.ycombinator.com/item?id=38020792
| dang wrote:
| Thanks! Macroexpanded:
|
| _Running the "Reflections on Trusting Trust" Compiler_ -
| https://news.ycombinator.com/item?id=38020792 - Oct 2023 (67
| comments)
|
| (posted by rsc himself!)
| stephc_int13 wrote:
| There is something nightmarish about this kind of exploit, and
| that is maybe why we've been collectively in denial for such a
| long time.
|
| How many supply-chain-generated backdoors are in the wild today?
| kibwen wrote:
| Supply-chain backdoors are one thing, but the saving grace
| about the backdoors injected by trusting-trust attacks like in
| the OP is that they're quite fragile. If the shape of the
| expected target program changes sufficiently, then the backdoor
| will fail to propagate. And propagating the backdoor into
| programs that have yet to be written requires an effectively
| omniscient adversary. As long as you have multiple
| implementations of a toolchain, you can repeatedly use them to
| compile each other in every possible configuration (and then
| use the outputs of those compilation steps as inputs to the
| same process, repeatedly) and then compare that the outputs are
| bit-for-bit identical until you reach a point where you're
| assured that either none of the compilers are backdoored or all
| of them would have to be.
|
| So yes, they're scary, but there are countermeasures.
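
The bit-for-bit comparison described in the comment above can be sketched roughly as follows. This is a minimal illustration, not a full diverse double-compiling setup: it assumes the builds are reproducible (deterministic), and the helper names `sha256_of`, `group_by_digest`, and `outputs_agree` are hypothetical, as are any file paths you would pass in.

```python
import hashlib


def sha256_of(path):
    """Return the SHA-256 hex digest of the file at `path`."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large compiler binaries do not need to fit in memory.
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()


def group_by_digest(paths):
    """Group build outputs by content hash: {digest: [paths with that content]}."""
    groups = {}
    for p in paths:
        groups.setdefault(sha256_of(p), []).append(str(p))
    return groups


def outputs_agree(paths):
    """True only if at least one path is given and all files are bit-for-bit identical."""
    return len(group_by_digest(paths)) == 1
```

In use, each path would be the same compiler source built through a different chain of compilers (for example, hypothetical files such as `stage2-via-gcc/cc` and `stage2-via-clang/cc`). More than one group in `group_by_digest` means some chain produced a different binary, which is a reason to investigate rather than proof of a backdoor.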
| legobmw99 wrote:
| One of my favorite short stories plays on the natural horror of
| this realization:
| https://www.teamten.com/lawrence/writings/coding-machines/
| Veserv wrote:
| I mean, this class of attack is defeated as a matter of course
| in high reliability applications like aerospace. You just check
| the compiler output precisely corresponds to the compiler
| input. You need to do that anyways to prevent miscompilation
| errors in general (this class of attack is just intentional
| miscompilation), so it hardly counts as a nightmarish problem,
| just an annoying one.
| azornathogron wrote:
| How is this normally done? Manual review of machine code?
| Veserv wrote:
| I do not work on verification directly, so I do not know
| the details. I believe at a high level you generally have an
| automated first pass which generates a source<->machine
| trace. You then have multiple independent reviews
| validating the correctness and exhaustiveness of the trace.
|
| Generally in these environments you build with
| optimizations off unless they are needed, to ease this
| validation process, since you must obviously validate the
| released/deployed version.
| kwhitefoot wrote:
| > You just check the compiler output precisely corresponds to
| the compiler input.
|
| _just_ is doing a lot of work there.
| Veserv wrote:
| What do you want me to say? That is the work that you must
| do and that is routinely done. It is annoying, but it is
| not hard or special. Hardly a nightmarish problem if it
| gets solved routinely at scale across an entire industry.
| ludocode wrote:
| The team at bootstrappable.org have been working very hard at
| creating compilers that can bootstrap from scratch to prevent
| this kind of attack (the "trusting trust" attack is another name
| for it.) They've gotten to the point where they can bootstrap in
| freestanding so they don't need to trust any OS binaries anymore
| (see builder-hex0.)
|
| I've spent a lot of my spare time the past year or so working on
| my own attempt at a portable bootstrappable compiler. It's partly
| to prevent this attack, and also partly so that future
| archaeologists can easily bootstrap C even if their computer
| architectures can't run any binaries from the present day.
|
| https://github.com/ludocode/onramp
|
| It's nowhere near done but I'm starting a new job soon so I felt
| like I needed to publish what I have. It does at least bootstrap
| from handwritten x86_64 machine code up to a compiler for most of
| C89, and I'm working on the final stage that will hopefully be
| able to compile TinyCC and other similar C compilers soon.
| pfortuny wrote:
| Impressive work and truly necessary! Thanks for sharing.
| toolslive wrote:
| really nice, and impressive work. However, I'm left wondering
| if a route via Forth would not have been a lot shorter.
| necheffa wrote:
| So bootstrap in freestanding does make this kind of attack much
| more difficult to pull off, but with contemporary hardware, it
| does not fully prevent the attack.
|
| What if the trojan is in microcode? No amount of bootstrap in
| freestanding can protect you here.
| ludocode wrote:
| It is true that there are many layers of code below the OS
| level. UEFI for example is probably hundreds of thousands of
| lines of compiled code. Modern processors have the Intel ME and
| equivalents with their own secret firmware. Almost all modern
| peripherals will have microcontrollers with their own
| compiled code.
|
| These are all genuine attack vectors but they are not really
| solvable from the software side. At least for Onramp I
| consider these problems to be out of scope. It may be
| possible to solve these with open hardware but a solution
| will look very different from the kind of software
| bootstrapping we're doing.
| hinkley wrote:
| Correct me if I'm wrong, but isn't this recreating a thing that
| used to exist? I have memories of being told of a compiler
| older than GCC that could compile itself using... I want to say
| a bash script. It took forever to run because you had to run
| the script which of course was slow, and then it output a
| completely unoptimized compiler. And if memory serves that
| output didn't have any of the optimization logic in it. So you
| had to compile it again to get the optimizer passes to be
| compiled in, then compile it again to get a fast compiler (self
| optimization).
| markbnj wrote:
| Clicked on the first Bell Labs link hoping to read the original
| and ended up on some VPN service's landing page. Sigh. :(
| cratermoon wrote:
| I preferred when link rot led to a 404 or site not found. I
| have recently been working with some stuff I wrote ~2004 and so
| many links are now dead. I've been giving archive.org a real
| workout.
| bhelyer wrote:
| A live link:
| http://users.ece.cmu.edu/~ganger/712.fall02/papers/p761-thom...
| cratermoon wrote:
| It looks like the xz hack riffs on this idea in getting the
| payload into the final binary, and attempting to hide itself.
| raspyberr wrote:
| >"infinite spinner"
|
| >"This site uses features not available in older browsers."
|
| >Enable Javascript.
|
| Black plaintext on white background with hyperlinks.
| Revolutionary.
| anthk wrote:
| Javascript-less link:
|
| https://kidneybone.com/c2/wiki/TheKenThompsonHack
| dwheeler wrote:
| For more, including information about the "Diverse Double-
| Compiling" (DDC) countermeasure, see my page:
| https://dwheeler.com/trusting-trust/
| legobmw99 wrote:
| Question from reading the abstract: how does one acquire this
| "second (trusted) compiler" when you are genuinely suspicious
| you are facing an adversary using an attack like this one?
| nunez wrote:
| From the ACM paper [0]:
|
| > Acknowledgment. I first read of the possibility of such a
| Trojan horse in an Air Force critique [4] of the security of an
| early implementation of Multics. I cannot find a more specific
| reference to this document. I would appreciate it if anyone who
| can supply this reference would let me know.
|
| ofc the US Gov was behind this. Incredible.
|
| [0]: https://dl.acm.org/doi/pdf/10.1145/358198.358210
| jimhefferon wrote:
| I'm not sure it is fair to claim that the Gov is behind the
| white paper, in that paying for a study is different than a
| program to develop malicious code.
| kkfx wrote:
| A small side note: the website looks like a classic web site,
| BUT if you run a privacy-attentive WebVM [1] you first get
| "javascript required to view this site", even though plenty of
| classic-looking and modern-looking websites work very well
| without JS. If you enable JS but keep third-party content
| restricted, you get a "page does not exist"...
|
| [1] the monsters normally known as "browsers", which ought to be
| something like `less`, `more`, `most` and so on, not much like a
| JVM
|
| A criticism just to say: "please, if your audience is some
| tech-savvy cohort, try to design things for them."
| ellis0n wrote:
| What's the potential percentage of malware on 100GB
| (100,000,000,000 bytes) in your system if a backdoor occupies 100
| bytes?
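
Read literally as a size ratio, the question above works out as in the small sketch below. It only computes what share of the bytes a 100-byte backdoor would occupy, not the probability that one is present; the function name `size_share_percent` is made up for illustration.

```python
from fractions import Fraction


def size_share_percent(part_bytes, total_bytes):
    """Exact percentage of `total_bytes` taken up by `part_bytes`."""
    return Fraction(part_bytes, total_bytes) * 100


# Figures from the comment: a 100-byte backdoor in 100,000,000,000 bytes.
share = size_share_percent(100, 100_000_000_000)
print(share)         # 1/10000000
print(float(share))  # 1e-07
```

So the backdoor would be one ten-millionth of a percent of the data, or one byte in every billion.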
| kbenson wrote:
| 100GB of _what_?
|
| Text documents? Word documents? CSV files? Excel files? Cat
| videos? Porn videos? Source code from a trusted source? Source
| code from randos? Games or apps installed from a marketplace?
| Games or apps installed as downloads directly from random
| websites?
|
| It's like asking what's the chance you'll die from a fall if
| you live 80 years. The chance is hard to know, but it's
| probably more likely if you're a free solo rock climber than if
| you aren't.
| ellis0n wrote:
| Sure modern Mac/Windows/Linux system files with browser
| cache. Code+data files after build. This is enough to never
| find out about those 100 bytes.
|
| Not to mention hardware backdoors in chips. Have all
| trillions of transistors and their schematics in each chip
| version been checked for possible backdoors in the Verilog
| compiler? One extra connection can provide root access and
| cost millions of dollars.
| fsflover wrote:
| Not all data on your PC are equally dangerous. The key is to
| keep the Trusted Computing Base as small as possible. See: Qubes
| OS.
| ellis0n wrote:
| QubesOS is amazing! I've been working on QubesOS for several
| years. When it was just starting out, it was a fairly simple
| solution, but now it's a big piece of software, and it has just as
| many problems as any Linux distribution. Complexity is the
| only parameter by which you can determine the amount of
| malware. For 0 bytes of useful code, there will be 0
| backdoors. I also developed ACPU OS for this reason 12 years
| ago, but later realized that no one would pay for simplicity.
| anthk wrote:
| GNU Mes tries to avoid this by bootstrapping _everything_.
| acchow wrote:
| I actually thought his Turing award was for this attack. Turns
| out it was "just" his acceptance speech!
___________________________________________________________________
(page generated 2024-03-30 23:02 UTC)