[HN Gopher] Show HN: Tracexec - TUI for tracing execve and pre-e...
___________________________________________________________________
Show HN: Tracexec - TUI for tracing execve and pre-exec behavior
tracexec helps you to figure out what and how programs get executed
when you execute a command. It's useful for debugging build
systems, catching fd leaks, understanding what shell scripts
actually do, figuring out which programs a piece of proprietary
software runs, etc.
Author : kxxt
Score : 37 points
Date : 2024-05-08 12:04 UTC (10 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| _flux wrote:
| Pretty nice! I recall having used strace for this, and this is
| certainly nicer for that use case. I think I used it to determine
| how a certain file was built. Somehow it was easier to do it that
| way than understand the build system :).
|
| What I did though was generate dot-files for graphviz to process.
| To that graph it's easy to add information about opened/created
| files as well.
| _flux wrote:
| Found the script, so here it is for posterity!
|     #!/bin/sh
|     if [ -z "$1" ]; then
|       echo "usage (in zsh): 'strace -f program -o logi; okular =(dot -Tpdf =($0 logi))'"
|       exit 0
|     fi
|     LOG=$1
|     echo 'digraph {'
|     echo 'rankdir="LR";'
|     sed 's/^\([0-9]*\) \(<... \)\?\(clone\|vfork\).*= \([0-9]*\)/\1 -> \4;/; t; d' "$LOG"
|     sed '/ENOENT/ d; s/^\([0-9]*\) execve("[^"]*\/\([^"]*\)".*/\1 [label="\1: \2"];/; t; d' "$LOG" | sort | uniq
|     sed '/ENOENT/ d; s/^\([0-9]*\) openat([^,]*, "\([^"]*\)", .*\(O_WRONLY\|O_RDWR\).*/\1 -> "\2"; "\2" [shape=rectangle];/; t; d' "$LOG" | sort | uniq
|     echo '}'
|
| I _imagine_ it makes a graph of executed programs along with the
| files they created; perhaps it even works in some cases.
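For readers who find the sed expressions hard to follow, here is a rough Python sketch of the same idea. It assumes `strace -f -o FILE` output in which every line starts with the PID. The function name `strace_to_dot` and the sample log lines are made up for illustration, and the regular expressions only approximate the ones in the script.

```python
import re

# Patterns approximate the sed expressions in the script; they assume
# `strace -f -o FILE` output, where each line begins with the PID.
FORK_RE = re.compile(r'^(\d+) (?:<\.\.\. )?(?:clone|vfork).*= (\d+)')
EXEC_RE = re.compile(r'^(\d+) execve\("[^"]*/([^"]*)"')
OPEN_RE = re.compile(r'^(\d+) openat\([^,]*, "([^"]*)", .*(?:O_WRONLY|O_RDWR)')


def strace_to_dot(lines):
    """Return Graphviz dot text: fork edges, exec labels, written files."""
    forks, labels, writes = [], set(), set()
    for line in lines:
        m = FORK_RE.match(line)
        if m:
            forks.append(f'{m.group(1)} -> {m.group(2)};')
            continue
        if 'ENOENT' in line:
            continue  # ignore failed lookups, as the sed script does
        m = EXEC_RE.match(line)
        if m:
            pid, name = m.groups()
            labels.add(f'{pid} [label="{pid}: {name}"];')
            continue
        m = OPEN_RE.match(line)
        if m:
            pid, path = m.groups()
            writes.add(f'{pid} -> "{path}"; "{path}" [shape=rectangle];')
    body = [*forks, *sorted(labels), *sorted(writes)]
    return '\n'.join(['digraph {', 'rankdir="LR";', *body, '}'])


# Hypothetical log lines, shaped like `strace -f -o` output.
SAMPLE_LOG = [
    '100 execve("/usr/bin/make", ["make"], 0x7ffd /* 3 vars */) = 0',
    '100 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|SIGCHLD) = 101',
    '101 execve("/usr/bin/cc", ["cc", "-c", "a.c"], 0x55 /* 3 vars */) = 0',
    '101 openat(AT_FDCWD, "a.o", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3',
    '101 openat(AT_FDCWD, "/missing", O_RDWR) = -1 ENOENT (No such file or directory)',
]

if __name__ == "__main__":
    print(strace_to_dot(SAMPLE_LOG))
```

Collecting labels and file edges in sets plays the role of `sort | uniq` in the shell version.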
| nh2 wrote:
| Ah yes, very good!
|
| More such tools are needed that make common strace tasks more
| convenient.
|
| strace is the ultimate debugging tool to dive into almost every
| problem where a program doesn't do what you want, but its
| stream-based nature and lack of scripting don't make all tasks
| convenient. Most importantly, you can't really "program" with it.
|
| Some time ago, I started implementing "strace as a library" in
| Haskell, to get an easy-to-process stream of syscalls/signals as
| an ADT, so that one can easily build features on top of it, e.g.
| a tree of fork()ed/execve()'d child processes. That's point
| "special run modes tailored to specific tasks (e.g. execve tree)"
| from https://github.com/nh2/hatrace. I haven't gotten as far as
| I'd like yet, because I'd like to first explicitly support all
| Linux syscalls, and build the features on top of that afterwards.
| And some features are not so easy, e.g. reading memory of the
| traced process (which strace does to, e.g., show what data was
| `read()`).
|
| From that experience, I applaud anybody who makes good use of
| ptrace() -- that man page is not the easiest to read!
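To make the "stream of events as an ADT, with a process tree built on top" idea concrete, here is a small Python sketch. It is not hatrace's API (hatrace is written in Haskell). The `Fork`, `Exec`, and `ProcNode` types and the sample events are hypothetical, and a real tracer would obtain the events from ptrace() rather than from a hand-written list.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass(frozen=True)
class Fork:
    parent: int
    child: int


@dataclass(frozen=True)
class Exec:
    pid: int
    path: str


# A tiny stand-in for a syscall ADT; a real one would cover far more cases.
Event = Union[Fork, Exec]


@dataclass
class ProcNode:
    pid: int
    execs: list = field(default_factory=list)
    children: list = field(default_factory=list)


def build_tree(root_pid, events):
    """Fold a stream of Fork/Exec events into a tree rooted at root_pid."""
    nodes = {root_pid: ProcNode(root_pid)}
    for ev in events:
        if isinstance(ev, Fork):
            child = nodes.setdefault(ev.child, ProcNode(ev.child))
            parent = nodes.setdefault(ev.parent, ProcNode(ev.parent))
            parent.children.append(child)
        elif isinstance(ev, Exec):
            nodes.setdefault(ev.pid, ProcNode(ev.pid)).execs.append(ev.path)
    return nodes[root_pid]


def render(node, depth=0):
    """Return indented lines, one per process, listing what it exec'd."""
    label = ' -> '.join(node.execs) or '(no exec)'
    lines = [f"{'  ' * depth}{node.pid}: {label}"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical event stream for illustration only.
SAMPLE_EVENTS = [
    Exec(1, '/bin/sh'),
    Fork(1, 2),
    Exec(2, '/usr/bin/make'),
    Fork(2, 3),
    Exec(3, '/usr/bin/cc'),
]

if __name__ == "__main__":
    print('\n'.join(render(build_tree(1, SAMPLE_EVENTS))))
```

Once events are plain values, features such as the execve tree become ordinary folds over a list, which is the appeal of the library approach.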
| nh2 wrote:
| After playing with the software a bit:
|
| The view that diffs environment variables of a process with the
| ones from the parent process is very nice.
|
| Could you add a feature that makes it possible to search/filter
| the "Environment" panel, maybe even the events panel? Often one
| wants to figure out how a certain argument, environment
| variable, or assigned value of each of those finds its way into
| a process. Being able to filter that quickly would be great!
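The parent/child environment diff and the requested filter can be sketched in a few lines of Python. This is only an illustration of the idea, not how tracexec implements it. The function names `diff_env` and `filter_diff` and the sample environments are assumptions.

```python
def diff_env(parent, child):
    """Compare two environment dicts and report what the child changed."""
    added = {k: v for k, v in child.items() if k not in parent}
    removed = {k: v for k, v in parent.items() if k not in child}
    changed = {
        k: (parent[k], child[k])
        for k in child
        if k in parent and parent[k] != child[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


def filter_diff(diff, needle):
    """Keep only entries whose variable name or value contains needle."""
    def hit(key, value):
        # "changed" entries hold an (old, new) tuple; others hold a string.
        values = value if isinstance(value, tuple) else (value,)
        return needle in key or any(needle in v for v in values)

    return {
        kind: {k: v for k, v in entries.items() if hit(k, v)}
        for kind, entries in diff.items()
    }


# Hypothetical environments for illustration only.
PARENT_ENV = {"PATH": "/usr/bin", "LANG": "C", "OLD": "1"}
CHILD_ENV = {"PATH": "/opt/bin:/usr/bin", "LANG": "C", "CC": "gcc"}

if __name__ == "__main__":
    print(filter_diff(diff_env(PARENT_ENV, CHILD_ENV), "opt"))
```

Filtering on both names and values matters for the use case described: the interesting string is often a path fragment buried inside a longer value.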
| nh2 wrote:
| I packaged your program for Nix [1].
|
| If possible, have a look at the build, or even better, become
| co-maintainer if you can!
|
| If anybody wants to try it, on any Linux distro with `nix`
| installed:
|
|     NIX_PATH=nixpkgs=https://github.com/nh2/nixpkgs/archive/e5943da391879c12ed8d6477a06805c8de4b27da.tar.gz nix-shell -p tracexec
|
| This will drop you into a shell where `tracexec` is installed.
|
| [1]: https://github.com/NixOS/nixpkgs/pull/310158
| abathur wrote:
| We have some common cause / need in the Nix/nixpkgs ecosystem,
| where finding missing package dependencies (whether they're bare
| invocations or absolute paths) is a bit of an ongoing problem.
|
| The golden path stuff isn't too hard to find during packaging,
| but there's a long tail of stuff that only shows up when you hold
| something exactly right.
|
| We can (with some effort) handle a decent fraction of posix/bash
| shell statically with https://github.com/abathur/resholve. I
| think a similar approach could generalize to most interpreted
| languages (but I haven't gotten around to it myself and I think
| it's broadly rarer beyond Shell).
|
| I have taken one or two steps into a lower-fidelity approach
| (https://github.com/abathur/binlore) that uses YARA rules to scan
| for signs of ~exec in both binary formats and interpreted
| languages. For now this is mostly just a top-N formats thing, but
| there's no reason we can't keep going. One of the big advantages
| is that it's fairly cross-platform, even if accuracy is
| suboptimal.
|
| The scanning approach isn't just amazing _for resholve's
| specific needs_. I'm stepping down this path to try to identify
| executables that likely exec their command-line arguments--
| because this is a sign that resholve needs to know how to check
| invocations of that command for other executables. But it's a
| good bit better at flagging things that have _any_ exec behavior,
| and that's a decent-enough first step for letting us focus on a
| smaller set.
|
| Once we find these, though, we're generally left with human
| analysis of the help, manpage, and/or source to figure out what
| exec the scanner is hitting on and whether it's part of the CLI.
|
| I've been ~snoozing on this front for a while, but I feel like a
| decent fraction of the components for building something like an
| 80% toolkit exist in at least a primitive form. It could probably
| benefit from some more static source analysis bits. I also
| vaguely intend to look into whether LLMs can do a reasonably-good
| job of making it easier to answer these questions from the
| help/man/source. Unless something out in that space can be
| reliably automated, it also needs some way to ~pin assertions
| about the arguments to the relevant code in a way that they'll
| break if that specific corner of the codebase changes. (I've
| wondered if semgrep expressions could handle this.)
|
| There are also some other bits around that might fit in, though
| cross-platform is a common-ish problem. We've looked a little at
| using a FUSE, but that is hard to operationalize on macOS. I also
| poked at 3 or so less-interactive dynamic approaches a few years
| back in https://github.com/abathur/commandeer/ and I'm sure a few
| more have cut their teeth since then. I also wrote a ~shelved-
| for-now WIP (https://github.com/abathur/faffer) for using some
| cursed shell metaprogramming to do something like fuzz/tree-shake
| a shell script for loose dependencies. That isn't directly viable
| for many languages, but its focus on ~overloading flow-control
| structures to make it easier to force branch coverage does seem
| like something that might be feasible with rewriting tools.
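The lower-fidelity "scan for signs of exec" idea from the comment above can be illustrated with a crude substring search in Python. This is neither binlore's implementation nor a YARA rule. The marker list and the name `exec_signs` are assumptions chosen for the example, and a match only hints that a file may exec something.

```python
# Hypothetical marker list: names of libc functions that run other programs.
# Real rules would be format-aware; plain substrings give false positives.
EXEC_MARKERS = (b"execve", b"execvp", b"posix_spawn", b"system", b"popen")


def exec_signs(data: bytes):
    """Return the sorted marker names that occur anywhere in data."""
    return sorted(m.decode() for m in EXEC_MARKERS if m in data)


def scan_file(path):
    """Read a file as bytes and report which markers it contains."""
    with open(path, "rb") as fh:
        return exec_signs(fh.read())


if __name__ == "__main__":
    import sys
    for name in sys.argv[1:]:
        print(name, scan_file(name))
```

Like the scanner described in the comment, this can only narrow down the set of candidates; deciding whether the exec is driven by command-line arguments still takes human analysis.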
| ranger_danger wrote:
| Does this use eBPF? Can it be expanded to work with other
| syscalls besides exec? I find myself using not only execsnoop.bt
| but also opensnoop.bt, statsnoop.bt, tcpconnect.bt etc.
___________________________________________________________________
(page generated 2024-05-08 23:01 UTC)