[HN Gopher] LibLISA - Instruction Discovery and Analysis on x86-64
       ___________________________________________________________________
        
       LibLISA - Instruction Discovery and Analysis on x86-64
        
       Author : Luc
       Score  : 78 points
       Date   : 2024-10-24 07:35 UTC (1 days ago)
        
 (HTM) web link (liblisa.nl)
 (TXT) w3m dump (liblisa.nl)
        
       | jf wrote:
       | I've long wanted to have a way to see what actually happens
       | inside a CPU when a set of instructions are executed. I'm pretty
       | excited after skimming this paper as it looks like they developed
       | a technique to automatically determine how the x86-64
       | instructions actually work by observing real world CPU behavior.
        
         | dzaima wrote:
         | This is determining the directly observable behavior rather
         | than anything about microarchitectural specifics.
        
           | ddingus wrote:
           | And still quite a useful tool, particularly when exploring
           | undocumented instructions.
        
         | fragmede wrote:
         | blinkenlights might be up your alley, if you haven't seen it
         | before.
         | 
         | https://justine.lol/blinkenlights/
        
           | rbanffy wrote:
           | Every computer deserves a blinking lights front panel.
        
         | westurner wrote:
         | From https://news.ycombinator.com/item?id=33563857 :
         | 
         | > Memory Debugger
         | 
         | Valgrind > Memcheck, None:
         | https://en.wikipedia.org/wiki/Valgrind ; TIL Memcheck's `none`
         | provides a traceback where the shell would normally just print
         | "Segmentation fault"
         | 
         | DynamoRio > Dr Memory:
         | https://en.wikipedia.org/wiki/DynamoRIO#Dr._Memory
         | 
         | Intel Pin: https://en.wikipedia.org/wiki/Pin_(computer_program)
         | 
         | https://news.ycombinator.com/item?id=22095435, : SoftICE, EPT,
         | Hypervisor, HyperDbg, PulseDbg, BugChecker, pyvmidbg (libVMI +
         | GDB), libVMI Python, volatilityfoundation/volatility,
         | Google/rekall -> yara, winpmem, Microsoft/linpmem, AVML,
         | 
         | rr, Ghidra Trace Format:
         | https://github.com/NationalSecurityAgency/ghidra/discussions...
         | https://github.com/NationalSecurityAgency/ghidra/discussions...
         | : appliepie, orange_slice, _cannoli_
         | 
         | GDB can help with register introspection:
         | https://web.stanford.edu/class/archive/cs/cs107/cs107.1202/l...
         | :
         | 
         | > _Auto-display and Printing Registers: The `info reg` command_
         | [and `info all-registers` (`i r a`)]
         | 
         | emu86 implements X86 instructions in Python, optionally in
         | Jupyter notebooks; still w/o X86S, SIMD, AVX-512, x86-84-v4
        
           | westurner wrote:
           | > _Valgrind `none` provides a traceback where the shell would
           | normally just print "Segmentation fault"_
           | 
           | What would it take to get `bash` to print a --traceback after
           | "Segmentation fault\n", and then possibly also --gdb like
           | pytest --pdb?
           | 
           | - [ ] ENH: bash: add valgrind `none`-style ~ --traceback
           | after "Segmentation fault" given an env var or by default?
        
       | saagarjha wrote:
       | This is neat but the analysis of their work leaves a bit to be
       | desired. You can't just randomly select instructions and see if
       | you did a good job, because the instruction space is not really
       | uniform on any axis that people care about. For example, on a
       | hypothetical ISA that has most the encoding space that is, like,
       | simple arithmetic ops then you can get "good" coverage really
       | easily. But that's not actually very useful because the
       | instructions people care about when doing this kind of analysis
       | are specific and usually more esoteric, and difficult to analyze
       | with a simple bitstring approximation. Like, this definitely
       | cannot discover the semantics of syscall, or rdrand. The authors
       | claim they would have been able to discover reptar if they
       | extended their work slightly, but I think it is pretty dubious
       | that their methodology is powerful enough to do so.
        
       | pabs3 wrote:
       | Reminds me of sandsifter, a fuzzer for the x86 ISA:
       | 
       | https://github.com/xoreaxeaxeax/sandsifter
        
         | amy-petrik-214 wrote:
         | great piece on application of such -
         | https://media.defcon.org/DEF%20CON%2026/DEF%20CON%2026%20pre...
        
       | jxors wrote:
       | Hi! I'm one of the authors. Cool to see our work show up on HN!
       | 
       | I'm happy to answer questions if there are any.
        
       | specialgoodness wrote:
       | This is interesting work but it totally misses the boat when it
       | talks about the current state of the art. They cite a 2014
       | version of the Goel-Hunt-et al formal x86 model in ACL2, but they
       | fail to talk about its modern version. The modern version
       | (developed at Centaur and then Intel!) is an ACL2 x86 model that
       | is so precise that it can boot Linux and run user-land programs.
       | Let me say that again: it is a formal mathematical model of a
       | processor that is so precise that it can boot Linux and run user-
       | land programs! This is a monumental accomplishment and is not
       | even mentioned in their paper.
        
       ___________________________________________________________________
       (page generated 2024-10-25 23:01 UTC)