* Flesh out the child ops some more - Add more things that a real program would do. - add all the ops things like fsx do. - do file ops on a bunch of trinity test files - open->read->close - open->mmap->access mem->close - sysctl flipper. - pick random elevator alg for all queues - fork-and-dirty mappings - Ability to mark some ops as 'NEEDS_ROOT'. - Move the drop privs code from main to just before we start a new child. * maps.c improvements: - Sometimes generate overlapping addresses/lengths when we have ARG_ADDRESS/ARG_ADDRESS2 pairs - make sure ARG_ADDRESS only uses addresses from this list, and audit all other mmap/malloc uses in sanitise routines. - munge lengths when handing them out. - mmap files (we do this already, but don't track it properly) - get_map_fragment() - mprotect parts of a map will need to somehow track what pages are RO/RW etc - keep track of holes when munmap'd split maps in two ? (store original len, and current len) * munge_process() on child startup - replace fork() with random clone() - run children in different namespaces, personalities. - unshare * ioctl improvements - needs filename globbing for some ioctls - Sanitise routines for more ioctls - ext4 - Maybe just make the ioctl's be NEED_ROOT child ops * Some debugging enhancements. - Make -D use a separate debug log file - if we have a large number of children, we use up a lot of fd's for the log files. Instead of keeping them all open, reopen them as needed. * postmortem improvements - change child->syscall / ->previous to be a ringbuffer of syscallrecord structs. - Compare timestamp that taint was noticed at, ignore all later. * Do taint watching in the child loop too. * --dry-run mode. need to work around segv's when we do things like mmap->post and register null maps. * Rewrite the fd code. - kill off NR_FILE_FDS - open some files in the child too after forking. - this requires a child-local fd mapping table. Maybe we can then reduce the size of the shared shm->file_fds - When requesting an fd, occasionally generate a new one. - Could we do the nftw walks in parallel ? That would speed up startup a lot. Though would need to pass list back up to main thread somehow. - support for multiple victim file parameters - When picking a random path, instead of treating the pool of paths as one thing, treat it as multiple (/dev, /sys, /proc). And then do a 1-in-3 chance of getting one of those. Right now, because there are 5-6 digits worth of /proc & /sys, they dominate over the /dev entries. - more fd 'types' (fanotify_init) * Change regeneration code. - Instead of every n syscalls, make it happen after 15 minutes (but with a minimum of n syscalls) * Pretty-print improvements. - decode fd number -> filename in output - decode addresses when printing them out to print 'page_rand+4' instead of a hex address. - ->decode_argN functions to print decoded flags etc. - decode maps. * Watchdog improvements - in main loop, check watchdog is still alive - RT watchdog task ? * filename related issues. - filename cache. Similar to the socketcache. Create on startup, then on loading, invalidate entries that aren't present (/proc/pid etc). This should improve reproducability in some cases. Right now, when a syscall says "open the 5231st filename in the list", it differs across runs because we're rebuilding the list, and the system state means stuff moves around. - Add a way to add a filter to filelist. ie, something like --filter-files=*[0-9]* to ignore all files with numbers in them. Maybe also make this a config file parser. - Dump filelist to a logfile. (Perhaps this ties in with the idea above to cache the filelist?) - blacklist filenames for victim path & /proc/self/exe - make sure we don't call unlink() or rmdir() on them - also need to watch /proc/$$/exe, look up using shm->pids. - file list struct extensions - use count * Networking improvements. - Rewrite socket generation. Organise into (sorted) per-protocol buckets of linked-lists.. - Search buckets for dupes before adding. - for syscalls that take a fd and a sockaddr, look up the triplet and match. - Flesh out sockaddr/socket gen for all remaining protocols. - setsockopt on network sockets when we regenerate Disabled right now, because it causes some socket types to hang. - specify an ip of a victim machine (Maybe also config file) * Improve the ->post routine to walk a list of objects that we allocated during a syscalls ->sanitise in a ->post method. - On return from the syscall, we don't call the destructor immediately. We pick a small random number, and do N other syscalls before we do the destruction. This requires us to create a list of work to be done later, along with a pointer to any allocated data. - some ancillary data needs to be destroyed immediately after the syscall (it's pointless keeping around mangled pathnames for eg). For this, we just destroy it in ->post - Right now ->sanitise routines have to pick either a map, or malloc itself and do the right thing to free it in ->post. By tagging what the allocation type was in generic-sanitise, we can do multiple types. * Perform some checks on return from syscall - check padding between struct members is zeroed. * Output errno distribution on exit * allow for multiple -G's (after there is more than 'vm') * audit which syscalls never succeed, and write sanitise routines for them * if a read() blocks and we get a kill from the watchdog, blacklist (close?) that fd/filename. * Some of the syscalls marked AVOID are done so for good reason. - Revisit fuzzing ptrace. - It's disabled currently because of situations like.. child a traces child b child a segfaults child b never proceeds, and doesn't get untraced. * Further syscall annotation improvements - Finish annotating syscall return types * structured logging. - To begin with, in parallel with existing text based logging. - Basic premise is that we store records of each syscall in a manner that would allow easier replay of logs. - For eg, if a param is an fd, we store the type (socket/file/etc..) as well as a pathname/socket triplet/whatever to create it. - Eventually, kill off the text based logging, and replace it with ./trinity --parselog=mylog.bin - Done correctly, this should allow automated bisecting of replays. - Different replay strategies: - replay log in reverse order - brute force replay using 1 call at a time from beginning of log + last syscall. (possibly unnecessary if the above strategies are good enough) - Once bisected, have a tool that can parse the binary log, and generate C. - Would need a separate binary logfile for each child. Because locking on a shared file would slow things down, and effectively single thread things, unless the children pass things to a separate logger thread, which has its own problems like potentially losing the last N syscalls if we crash) - To begin with, just allow replay/bisect using one child process. Synchronising threads across different runs may be complicated. * Misc cleanups - Move arch specific syscalls into syscalls/arch/ - Move addresses in get_interesting_value() to a function in per-arch headers. * watch dmesg buffer for interesting kernel messages and halt if necessary. Lockdep for eg. - Pause on oops. Sometimes we might want to read trinity state when we trigger a bad event. * Blocked child improvements. - if we find a blocking fd, check if it's a socket, and shutdown() it. (tricky: we need to do the shutdown in the main process, and then tell other children) * make -p take an arg for seconds .