[HN Gopher] CVE-2021-32471 - Input validation in Marvin Minsky 1...
       ___________________________________________________________________
        
       CVE-2021-32471 - Input validation in Marvin Minsky 1967 Turing
       Machine
        
       Author : wsces
       Score  : 131 points
       Date   : 2021-05-10 11:18 UTC (11 hours ago)
        
 (HTM) web link (cve.mitre.org)
 (TXT) w3m dump (cve.mitre.org)
        
       | segfaultbuserr wrote:
       | A better link is the research paper:
       | 
       | https://arxiv.org/abs/2105.02124
       | 
       | > The universal Turing machine is generally considered to be the
       | simplest, most abstract model of a computer. This paper reports
       | on the discovery of an accidental arbitrary code execution
       | vulnerability in Marvin Minsky's 1967 implementation of the
       | universal Turing machine. By submitting crafted data, the machine
       | may be coerced into executing user-provided code. The article
       | presents the discovered vulnerability in detail and discusses its
       | potential implications. To the best of our knowledge, an
       | arbitrary code execution vulnerability has not previously been
       | reported for such a simple system.
       | 
       | > A common strategy for understanding a problem is to reduce it
       | to its minimal form. In the field of computer security,we may ask
       | the question: "What is the simplest system exploitable to
       | arbitrary code execution?" In this article, we pro-pose an answer
       | to that question by reporting on the discovery that a well-
       | established implementation of the universal Turing machine is
       | vulnerable to a both unintentional and non-trivial form of
       | arbitrary code execution.
        
         | Igorvelky wrote:
         | LIES
         | 
         | > A common strategy for understanding a problem is to reduce it
         | to its minimal form. In the field of computer security,we may
         | ask the question: "What is the simplest system exploitable to
         | arbitrary code execution?"
         | 
         | means that i will remove all security abstractions which were
         | generated in past 40 years and i find that it is more
         | vulnerable
         | 
         | this is not research, this is plain and simple trolling, BS
         | feeding
        
         | OskarS wrote:
         | Haven't looked into this paper deeply, but this reads very
         | strange to me:
         | 
         | > This paper reports on the discovery of an accidental
         | arbitrary code execution vulnerability in Marvin Minsky's 1967
         | implementation of the universal Turing machine. By submitting
         | crafted data, the machine may be coerced into executing user-
         | provided code.
         | 
         | It's a universal Turing machine. Its whole purpose is running
         | "user-provided code". That's what a Universal Turing machine
         | does, it runs arbitrary Turing machines. This is a little bit
         | like saying "we found a weakness in the Python interpreter
         | whereby you can feed it specially crafted input that allows you
         | to run arbitrary Python programs". Like... yeah... that's what
         | it's supposed to do.
        
           | tromp wrote:
           | If you do read the paper, you see that the universal machine
           | U expects both a description of the simulated machine T, and
           | a description of the input to T. The exploit is that they can
           | cause arbitrary code execution not by changing the former but
           | the latter. They also suggest a fix to U that makes it robust
           | against exploits in the description of T's input.
        
           | qsort wrote:
           | Yes, that's the joke.
        
             | UncleMeat wrote:
             | No it isn't. The paper forces the universal turing machine
             | to execute a _different_ machine than the one on the input
             | tape by providing crafted input to the intended simulated
             | machine. It isn 't a joke paper.
             | 
             | U is fed machine T and input I. By _only changing I_ you
             | can force U to execute a different machine X.
        
               | qsort wrote:
               | Sure, but you achieve that by specifying I in a way you
               | aren't really supposed to. You are effectively invoking
               | undefined behavior.
               | 
               | The implications of this are well understood. You'd have
               | similar vulnerabilities if you ran raw machine code on,
               | say, an x86 processor. Enforcing checks and sanitization
               | in such a way that the 'exploit' can't happen is the
               | exact kind of job you'd have to do in order to write an
               | operating system.
        
             | mycall wrote:
             | Wait until they try unicode, then the joke will be on them.
        
               | AreYouSirius wrote:
               | reddit is saying that macos is build on turing complete
               | emojis !
               | 
               | we should investigate XD
        
               | jradd wrote:
               | I cannot fathom how complicated arbitrary code execution
               | could get with multi byte characters that could use shift
               | registers, null bytes and byte order marks with
               | determinism in a NOP slide on a heap spray.
               | 
               | Filtering only printable user input helps but even bit
               | map images can expose a heap to a sensitive registers
               | that will execute some target specific generated shell
               | code.
               | 
               | https://en.m.wikipedia.org/wiki/NOP_slide.
               | 
               | https://en.m.wikipedia.org/wiki/Heap_spraying
        
             | OskarS wrote:
             | I get that the CVE is a joke, but are you saying the paper
             | is as well?
        
               | AreYouSirius wrote:
               | i dont know if he says exactly that by i do say exactly
               | that
               | 
               | paper is essentially saying that by using simplest
               | machine possible
               | 
               | we prove it does not provide layers of security
               | abstractions which were invented and deployed in last 40+
               | years to computers
               | 
               | so thats like getting CVE for turning off ASLR, AppARMOR
               | SElinux etc in linux kernel.
        
               | qsort wrote:
               | As far as I can tell, yes, it is.
               | 
               | Showing a universal Turing machine is a proof of the fact
               | that when you write stuff like the smn theorem or
               | anything that uses universal functions you aren't writing
               | nonsense.
               | 
               | There is no "contract" between the universal machine and
               | the hypothetical user. One could certainly ensure that,
               | say, the input TM is properly formatted, or that "the
               | runtime" will only use a certain set of cells, and so on.
               | The difficulties and problems one would find in trying to
               | do so are the same one would face when writing an
               | operating system and a compiler.
        
               | toxik wrote:
               | It's not a joke, people saying otherwise did not
               | understand the paper. The significance is debatable, it's
               | mostly a kind of collective facepalm for theoretical
               | computer scientists.
               | 
               | The CVE is a joke in the sense that you don't really need
               | a CVE for a universal Turing machine, as they are quite
               | literally only academic.
        
           | segfaultbuserr wrote:
           | > _This is a little bit like saying "we found a weakness in
           | the Python interpreter whereby you can feed it specially
           | crafted input that allows you to run arbitrary Python
           | programs"._
           | 
           | If my understanding is correct, it's more like, "we found a
           | weakness in the Python interpreter whereby you can feed it a
           | specially crafted input to a Python script that allows you
           | override that script and forces the Python interpreter to do
           | something else", which can be a reasonable point to make.
           | 
           | BTW, if we keep using the Python analogy, this scenario
           | sounds like a Python script that loads its configuration data
           | from a Pickle file on the hard drive. In Python, Pickle
           | doesn't validate the input and it's capable of modifying the
           | internal state of the program. What the authors say is that
           | malicious data in the Pickle file leads to code execution due
           | to deserialization of untrusted data. Depending on your
           | perspective, just like Python says a malicious Pickle file is
           | not in its threat model, you can argue that the Universal
           | Turing Machine is _never_ meant to be protected from
           | malicious data on the tape, and this is not a exploit, but an
           | interesting thought experiment nevertheless.
           | 
           | The paper says,
           | 
           | > _The universal machine, U, will be given just the necessary
           | materials: a description, on its tape, of T and of [the
           | initial configuration on T 's own, simulated tape] s_x; some
           | working space; and the built-in capacity to interpret
           | correctly the rules of operation as given in the description
           | of T. Its behavior will be very simple. U will simulate the
           | behavior of T one step at a time [...]_
           | 
           | > _[...] There is one obvious trust boundary in a universal
           | Turing machine, U: the initial string on the tape of the
           | simulated Turing machine, T. That string corresponds to the
           | user-provided data of an ordinary computer program. Because
           | the potential users may be unknown to the developers and
           | administrators of the computer and its programs, it is common
           | to view this data as untrusted. In our explorations of the
           | universal Turing machine, we will make the same assumption.
           | Therefore, if it were possible to execute arbitrary code
           | without manipulating the program of T, but only by providing
           | crafted data on T's simulated tape, that would constitute a
           | vulnerability._
           | 
           | Basically the authors' reasoning is:
           | 
           | 1. An Universal Turing machine is an interpreter/simulation
           | that is capable of executing code written for any Turing
           | machine, provided by the user.
           | 
           | 2. The user code takes an external input - the initial
           | content on the tape.
           | 
           | 3. It's possible to maliciously craft an input (the content
           | on the tape) to the user code to hijack the simulation,
           | without modifying the user code.
        
             | OskarS wrote:
             | Ah, that makes a ton of sense. Thanks!
        
       | Igorvelky wrote:
       | in my opinion this is recipe for DDoS attack on mitre
       | infrastructructure
        
       | al2o3cr wrote:
       | Meh. The "exploit" seems to rely on passing input that contains
       | cells with values that are valid for the universal machine's tape
       | alphabet, but not the simulated machine's.
       | 
       | It's as if you could pass a "null terminator" in a Unicode string
       | that didn't match the "normal" null character.
        
       | zitterbewegung wrote:
       | Sure this CVE sounds like a joke but if you create a programming
       | language that is non Turing complete it is much easier to secure
       | than a Turing complete language.
       | 
       | Making a language that have the expressive power of finite state
       | machines could be an example.
        
         | tester34 wrote:
         | What does Turing completness even mean in practice when it
         | comes to non-theoretical languages?
        
           | zitterbewegung wrote:
           | You can't do anything like recursion.
        
             | klyrs wrote:
             | I spent a while trying to parse this, and it doesn't make a
             | lick of sense.
             | 
             | Brainfuck is a classical example of a turing complete
             | language that doesn't have a notion of functions. What does
             | "recursion" mean here? Nothing, I think. So you don't need
             | recursion for Turing completeness.
             | 
             | Some people work on languages that support functions as
             | first class objects, which don't natively support
             | recursion. This is nice from a language design standpoint,
             | because functions only see the scope of their arguments and
             | can't look any further. The "Y Combinator" pattern is used
             | to bootstrap recursion into these languages. You may have
             | heard of it.
             | 
             | Turing-incomplete languages can also support unlimited
             | recursion. For example, take a Turing complete language,
             | and equip it with a state of the art theorem prover. Before
             | entering a loop, or calling a function, the theorem prover
             | is allowed to spend up to 1 minute proving that the loop or
             | function call will terminate. If the theorem prover times
             | out, the program halts with an error. Some functions, such
             | as a naive implementation of factorial, can be easily
             | proved to halt for all inputs. So this language allows
             | recursion, but by construction, all programs terminate, so
             | it's turing incomplete.
        
         | qsort wrote:
         | We are doing this already.
         | 
         | - Configuration languages (JSON, YAML, XML) are pure
         | combinatorial logic.
         | 
         | - Regular expressions are... mostly not actually regular, but
         | you get the idea.
         | 
         | - Some templating languages are deliberately less powerful than
         | Turing machines, e.g. ST4 is context-free.
         | 
         | - Prepared SQL statements are a similar idea on a different
         | axis.
         | 
         | The real question is whether a non-TC language could be useful
         | for general purpose programming. Such a language might come
         | with very strong guarantees (termination, time complexity, even
         | correctness or a limited form of correctness), but they might
         | be extra-cumbersome for 'normal' workloads.
        
           | ludamad wrote:
           | A non-TC language can be achieved just by forcing proofs. Oh,
           | you want that program to run? Just prove that it is bounded
           | memory and can't halt, etc
        
             | qsort wrote:
             | Yeah, but that's the 'cumbersome' part. Could we imagine a
             | non-TC language with simple syntax that's suitable for at
             | least some of the task you'd use a TC language for?
        
               | UncleMeat wrote:
               | A useful subset of SQL is not turing-complete and is
               | suitable for meaningful computational tasks. Datalog is
               | also not turing-complete and is similarly useful.
        
               | qsort wrote:
               | Sure, more examples are CSP solvers and AMPL. They are
               | all somewhat specific though. I was wondering if it would
               | be possible to make one that's "general purpose" in the
               | same sense you call programming languages "general
               | purpose", i.e. one you'd write application software in.
        
               | UncleMeat wrote:
               | Datalog would be the closest I can think of. A bunch of
               | static analysis engines are implemented in it.
        
               | magmastonealex wrote:
               | C compiled down to eBPF (as implemented with the Linux
               | kernel) is not turing-complete. eBPF has a wide variety
               | of uses from network filtering to hooking into certain
               | syscalls.
               | 
               | The bytecode is theoretically turing-complete, but the
               | verifier present in the kernel ensures that the code
               | contains no loops and accesses no out-of-bounds memory.
               | 
               | With `clang`, you can compile C down to eBPF bytecode.
               | It's a subset of C, with a lot of restrictions, but it is
               | familiar and capable of some remarkable things past just
               | simple packet parsing & filtering.
        
               | qsort wrote:
               | I had no idea, this is fascinating. Thanks for sharing.
        
       | AreYouSirius wrote:
       | But why to make CVE out of that ?
       | 
       | this paper, this CVE has nothing to do with ! USEFUL ! research,
       | 
       | paper is essentially saying that by using simplest machine
       | possible
       | 
       | we prove it does not provide layers of security abstractions
       | which were invented and deployed in last 40+ years to machines
       | 
       | so thats like getting CVE for turning off ASLR, AppARMOR SElinux
       | etc in linux kernel.
       | 
       | this proof is so obvious that highschooler should have to know
       | this.
       | 
       | If research alone does not get credited or recognized why we need
       | to award CVE's to make it more credible ? WHICH IS NOT EITHER WAY
       | 
       | second thing
       | 
       | do we as industry want VANITY CVE's ?
       | 
       | do we really need CVEs stemming from such a absurd exmples as
       | some old vacuum tube computer does not provide ASLR ? no sane
       | person wants that
       | 
       | and what if someone gets credited with 2 CVE's for heartbleed,
       | does such a person want to work in industry which awards CVE to
       | person exploiting bugs in 80' APPLE drives, IN 2021? which i know
       | of atlest 50 bugs in them. do i go and request CVE's for that ?
        
       | SilasX wrote:
       | @dang, this one should be merged here or vice versa:
       | 
       | https://news.ycombinator.com/item?id=27104125
        
       | userbinator wrote:
       | I think this should be considered a satire of what the "security
       | industry" has become: finding any little thing that they can
       | claim is exploitable, regardless of actual significance, and all
       | the while using paranoia to slowly destroy general-purpose
       | computing and user freedom.
        
         | chaganated wrote:
         | best comment, last comment
        
       | avibhu wrote:
       | > NOTE: the discoverer states "this vulnerability has no real-
       | world implications."
       | 
       | Not sure if declaring this is standard practice, but I had a good
       | laugh.
        
       | yabones wrote:
       | > NOTE: the discoverer states "this vulnerability has no real-
       | world implications."
       | 
       | At least they're honest about it with this CVE...
        
         | stevekemp wrote:
         | Issues like this, in obsolete code, are a lot of fun. Even if
         | they are essentially meaningless.
         | 
         | I reported CVE-2014-3423 back in the day, relating to GNU Emacs
         | using a predictable filename when talking to the Mosiac
         | browser. No choice, as that was what the browser required,
         | something that wouldn't exist these days.
        
         | buitreVirtual wrote:
         | I'm not so sure. Some banks and airlines might still be running
         | those machines.
        
       ___________________________________________________________________
       (page generated 2021-05-10 23:02 UTC)