[HN Gopher] Toolong: Terminal application to view, tail, merge, ...
       ___________________________________________________________________
        
       Toolong: Terminal application to view, tail, merge, and search log
       files
        
       Author : ingve
       Score  : 253 points
       Date   : 2024-02-09 17:27 UTC (2 days ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | asmosoinio wrote:
       | Looks very interesting and should solve a pretty common use case
       | for me - am often trying to debug some issue over many log files.
       | I will for sure test this next week.
        
       | alserio wrote:
       | Nice! I've found this kind of tool really useful and love the
       | merge functionality. I've skimmed the README but maybe I've
       | missed the info: does toolong support multiline logs like
       | stacktraces? Or is it possible to customize the recognized
       | formats?
        
         | willm wrote:
         | It should work fine with multi-line logs.
         | 
         | It's relatively easy to add new formats in code, but there isn't
         | yet a way of configuring that. In the future I might add a
         | config file to make that easy.
        
           | alserio wrote:
           | Thank you!
        
       | avtar wrote:
       | The code base seems like a good reference as a small Python
       | project.
       | 
       | My fav option in this class of apps: https://lnav.org/ It lets
       | you use journalctl with pipes as requested here:
       | https://github.com/Textualize/toolong/issues/4
        
         | maxyurk wrote:
         | lnav is great
        
         | SushiHippie wrote:
         | Mind elaborating why this would be a good reference? Not saying
         | it isn't, just want to understand why you think so
        
           | avtar wrote:
           | Hi there! I'm mostly AFK for a couple days so replies might
           | be delayed, but some notes:
           | 
           | - System tool niche that I find interesting so it's nice to
           | see a fairly complete and yet relatively small project
           | 
           | - Textual seems to be their largest dependency, but other
           | than that the project seems self-contained and relies on the
           | standard library
           | 
           | - With some exceptions, the majority of methods are short and
           | easy to parse
           | 
           | - Typing support :)
        
             | SushiHippie wrote:
             | Makes total sense, I think I was a bit thrown off after the
             | first glance because there are so many classes and files
             | [0] and it reads a bit like Java code.
             | 
             | But after a second glance it looks very well written
             | compared to many other python projects, which sometimes
             | read like a 5000 line bash script.
             | 
             | And I can't argue against your arguments, especially using
             | "minimal" dependencies and using typing.
             | 
             | Typing often helps for autocompletion and understanding
             | what a variable/function "means", which makes it [1] easier
             | to start hacking on it.
             | 
             | [0] not necessarily bad, just wasn't what I would expect to
             | be a small reference project
             | 
             | [1] not always, sometimes types can be too verbose and
             | start messing with your brain ;)
        
         | sandebert wrote:
         | Plus it's possible to download lnav as a statically linked
         | binary, which is very nice. (Would have been even better if it
         | was in the official repos.) I'm not interested in installing
         | things using yet another package manager, like pip or the like.
        
         | gregopet wrote:
         | I donated to lnav, it's just soo good!
        
       | hawski wrote:
       | What I often do when I analyze logs is remove timestamps, change
       | unique identifiers to something more predictable, and diff the
       | results to see when things started diverging from the norm,
       | because that is often earlier than the usual error/crash. Is
       | there anything that does something like this?
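
The workflow described above (strip timestamps, rename unique identifiers predictably, then diff) could be sketched roughly as follows. The two regular expressions are assumptions about what timestamps and identifiers look like; real logs will need their own patterns, and `normalize` and `diff_logs` are hypothetical helper names.

```python
import difflib
import re

# Assumed formats: ISO-like timestamps and hex-looking identifiers.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?Z?")
UNIQUE_ID = re.compile(r"\b(?:0x[0-9a-fA-F]+|[0-9a-fA-F]{8,})\b")


def normalize(lines):
    """Replace timestamps with <ts> and number identifiers by first appearance."""
    seen = {}
    out = []
    for line in lines:
        line = TIMESTAMP.sub("<ts>", line)
        # The same identifier always maps to the same <idN> token within one log.
        line = UNIQUE_ID.sub(
            lambda m: seen.setdefault(m.group(0), f"<id{len(seen) + 1}>"), line
        )
        out.append(line)
    return out


def diff_logs(good_lines, bad_lines):
    """Return a unified diff of the two logs after normalization."""
    return list(
        difflib.unified_diff(normalize(good_lines), normalize(bad_lines), lineterm="")
    )
```

Because identifiers are numbered in order of appearance, two runs that do the same things in the same order normalize to identical text, so the first diff hunk points at where they start to differ.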
        
         | xuhu wrote:
         | https://github.com/harjoc/LogDiff - unmaintained, parses only
         | ProcMon, but does the steps you describe, and uses Kdiff3 for
         | diffs. With threaded apps it can be hard to identify
         | automatically which thread is which.
        
         | 1kurac wrote:
         | This looks good: https://github.com/kernc/diff-logs
        
       | jonnycoder wrote:
       | This would have been great when I used to work in embedded
       | development and had to grep log files to root cause bugs. I had
       | thought about writing something like this but was always too
       | busy. The biggest feature is combining log files and organizing
       | by timestamp. Well done!
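
As a rough illustration of combining log files by timestamp (not Toolong's actual implementation), a lazy merge of already-sorted logs can be sketched with the standard library. It assumes every line starts with a `YYYY-MM-DD HH:MM:SS` timestamp; `timestamp` and `merge_logs` are hypothetical names.

```python
import heapq
from datetime import datetime


def timestamp(line):
    """Parse the assumed 19-character timestamp at the start of a line."""
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")


def merge_logs(*sources):
    """Lazily merge iterables of lines that are each already sorted by time."""
    return heapq.merge(*sources, key=timestamp)
```

Each source can be an open file object, so only one pending line per file is held in memory at a time.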
        
       | raldi wrote:
       | A handy utility I've written a few times is a tool that can
       | quickly extract a range of logs from a timestamped, ordered file
       | without reading every byte. This would be a good feature to add
       | to this.
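
One way such a utility might work is a binary search over byte offsets, so only a handful of lines are read before the wanted range is located. This is a minimal sketch, assuming every line starts with a `YYYY-MM-DD HH:MM:SS` timestamp and the file is sorted by time; all function names are hypothetical.

```python
import os
from datetime import datetime

TS_LEN = 19  # assumed: each line starts with "YYYY-MM-DD HH:MM:SS"


def _line_start_at_or_after(f, pos):
    """Return the offset of the first line that begins at or after pos."""
    if pos == 0:
        return 0
    f.seek(pos - 1)
    f.readline()  # consume the rest of the line containing byte pos - 1
    return f.tell()


def _first_offset_at_or_after(f, size, target):
    """Binary search for the offset of the first line with timestamp >= target."""
    lo, hi = 0, size
    while lo < hi:
        mid = (lo + hi) // 2
        start = _line_start_at_or_after(f, mid)
        if start >= size:
            hi = mid  # no complete line begins at or after mid
            continue
        f.seek(start)
        stamp = f.read(TS_LEN).decode("ascii")
        if datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S") < target:
            lo = mid + 1
        else:
            hi = mid
    return _line_start_at_or_after(f, lo)


def extract_range(path, start, end):
    """Return the lines whose timestamps satisfy start <= timestamp < end."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        begin = _first_offset_at_or_after(f, size, start)
        stop = _first_offset_at_or_after(f, size, end)
        f.seek(begin)
        data = f.read(stop - begin)
    return data.decode("utf-8", errors="replace").splitlines()
```

The search costs a logarithmic number of seeks in the file size; only the final read touches the bytes inside the requested range.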
        
       | piterrro wrote:
       | Congrats on the launch! I'm an author of https://logdy.dev. Seems
       | like we had a similar problem with logs and decided to solve it,
       | but in a slightly different way. Logdy works with pipes very
       | well, I'm wrapping up another version to be released soon.
        
         | kinow wrote:
         | I maintain a workflow manager, and had both textualize and
         | logdy on my list of projects to try soon. Planning to add a TUI
         | written in textualize, and was thinking of something like logdy
         | (or use it directly). Not sure which way to go now, will play
         | with both now and see which version users like best. Thanks for
         | logdy, and thanks to the creators of toolong too!
        
       | mstudio wrote:
       | This looks great. I spend a good amount of time each week
       | grepping through Kubernetes log files. Looking forward to trying
       | this out next week. I particularly like the pretty-print and
       | merge options.
        
       | willm wrote:
       | Hello. Author of Toolong here. Happy to answer any questions!
        
         | mstudio wrote:
         | Just wanted to say thanks for creating this. Wonderful tool!
        
           | willm wrote:
           | De Nada
        
         | barbs wrote:
         | What's the relevance of the kookaburra?
        
           | willm wrote:
           | We have a tradition of using a bird logo for our projects.
        
       | nickjj wrote:
       | I'd love to see a tool that lets you modify large files
       | efficiently.
       | 
       | I had to replace line 4 of a 200 GB SQL dump, it took a
       | substantial amount of compute time to perform a find / replace
       | with sed and it also required having over double the disk space
       | since sed creates a temp file before it writes out the new file.
       | 
       | Using a hex editor could have worked but it seemed too risky
       | because data integrity was really important.
        
         | coolThingsFirst wrote:
         | What's the challenge in here? Loading the text efficiently in
         | RAM?
        
           | nickjj wrote:
           | It's mainly to save time.
           | 
           | I don't remember exactly how long it took but I remember it
           | being something like 30 minutes of purely waiting for the
           | find / replace to finish on a 4 CPU core / 8 GB of memory
           | machine. Memory wasn't an issue fortunately.
        
             | filleduchaos wrote:
             | I mean, if you knew you _only_ wanted to find/replace on
             | line 4, why not simply stop the search after the first
             | match?
             | 
             | The syntax escapes me at the moment, but I'm quite sure
             | I've used sed (or maybe awk?) in the past to do exactly
             | that
        
               | PetahNZ wrote:
               | Sed takes ages to remove a single line from huge files
               | too.
        
               | nickjj wrote:
               | As far as I know sed will still read and write the full
               | file, even when you do a modification on a specific line.
               | I've deleted one specific line with sed and it took quite
               | some time too.
        
               | filleduchaos wrote:
               | There's a big difference between "quite some time" and 30
               | minutes, though. Of course sed needs to read the file and
               | write it back to disk, but that's capped by I/O speed
               | which is very high on modern drives - we're talking
               | _seconds_ for a file in the tens of gigabytes on the
               | fastest SSDs, a very far cry from half an entire hour.
        
               | nickjj wrote:
               | This was run on an AWS gp3 SSD EBS volume with 3k IOPS.
               | It was a CPU optimized c6i.xlarge machine. The file in
               | question was 200 GB.
               | 
               | I used "quite some time" because this happened months ago
               | and I don't recall the exact time on the delete. I don't
               | think deleting a line was any faster than doing a find /
               | replace on 1 line. If sed is reading and writing the file
               | in both cases I'd expect both to have the same
               | performance.
        
             | coolThingsFirst wrote:
             | Why would there be a market for something like this?
        
               | nickjj wrote:
               | Once in a while things come up where you need to make a
               | surgical edit in a really large file and it could be in a
               | scenario where time matters.
               | 
               | For example if you're doing a SQL dump -> import,
               | technically you could have downtime during this process
               | to eliminate any chance of data loss. Having to wait
               | ~30min for a command to finish is painful.
               | 
               | I'm not saying to go off and make this product to sell
               | but if such a tool existed and you positioned it at $39
               | or whatever, if it could eliminate half an hour of
               | downtime for a business it pays for itself. Especially if
               | the alternative is to muck around with hex editing the
               | file.
               | 
               | If 100,000 people have this problem and 3,000 of them
               | would pay for it that's $117,000. If you spent 6 months
               | making a super polished tool that made it easy to know
               | exactly what's being edited (or deleting lines, etc.)
               | that's pretty appealing. Even if the sales were half
               | that's still solid but it could also be 5x too, who
               | knows. Maybe you can finish such an app in 3 months
               | instead of 6, etc.. It's also an app that mostly feels
               | like it could reach a "done" state with little
               | maintenance since you're editing text files which is a
               | well known topic.
        
         | SushiHippie wrote:
         | Wouldn't something like this work for your specific use case:
         | 
         |     head -n4 sqldump > sqldump2
         | 
         | Then change line 4 in sqldump2 to whatever you wanted to,
         | and after that:
         | 
         |     tail -n+5 sqldump >> sqldump2
         | 
         | And now sqldump2 contains sqldump but with line 4 edited.
         | 
         | This still requires double the disk space, but at least
         | shouldn't take 30 minutes.
        
           | nickjj wrote:
           | That would work, but what would tail do differently than sed
           | here? Both would still need to read and write the full file
           | (minus 4 lines for tail), which, without testing, suggests the
           | time would be the same.
        
             | SushiHippie wrote:
             | I thought that sed may have a "high" overhead, and that's
             | why it took so long.
             | 
             | YMMV, but I just reproduced this on my machine, and the
             | tail command took 3 minutes. Maybe you are limited by disk
             | IO?
             | 
             | What was the sed command you used?
             | 
             | EDIT: sed was even a bit faster with:
             | 
             |     sed -i '4s/^.*$/your line four/' sqldump
             | 
             | Took 20s less than my tail solution, and does not occupy
             | duplicated disk space.
        
               | nickjj wrote:
               | I deleted the line with `sed -i "4d" sqldump`. The
               | duplicated disk space comes from sed writing a temporary
               | file out to the current directory, it does this in chunks
               | as the command runs.
               | 
               | It makes sense for sed to write a temp file even with
               | `-i` because what happens if your power goes out mid-way
               | through the command without the temp file? You'd have
               | data loss. To combat that sed will write a temp file out
               | and when the operation is completed the last step is to
               | move the temp file to the original file to overwrite it.
               | You might not notice this unless you monitor the
               | directory for changes while sed runs. You can spam `ls
               | -la` as sed runs on a decently sized file to see the temp
               | file.
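
The write-to-a-temp-file-then-rename pattern described above can be sketched in a few lines. This illustrates the general technique rather than sed's actual code; `delete_line` is a hypothetical helper, and the temp file is created in the same directory so the final rename stays on one filesystem.

```python
import os
import tempfile


def delete_line(path, line_number):
    """Delete one 1-based line by streaming to a temp file, then renaming it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp, open(path, "rb") as src:
            for number, line in enumerate(src, start=1):
                if number != line_number:
                    tmp.write(line)
        # The original stays intact until this single rename step.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```

The cost is the same as the one discussed in the thread: the whole file is read and rewritten, and roughly double the disk space is needed while the copy exists.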
        
         | verticalscaler wrote:
         | There is this thing called dd.
         | 
         | Hold it like so:
         | 
         |     seek=$bytes-offset-until-line-4 conv=notrunc,sync
         | 
         | Enjoy responsibly :)
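
The same idea as the dd suggestion (overwrite bytes at a known offset without truncating or rewriting the rest of the file) might look like this in Python. It is a sketch under two loudly stated assumptions: the replacement must fit within the original line's byte length, and padding the remainder with spaces must be harmless for the file's format. `overwrite_line_in_place` is a hypothetical helper.

```python
def overwrite_line_in_place(path, line_number, new_text, pad=b" "):
    """Overwrite one 1-based line in place; the file size never changes."""
    new_bytes = new_text.encode("utf-8")
    with open(path, "r+b") as f:
        # Only the lines before the target are read, not the whole file.
        for _ in range(line_number - 1):
            if not f.readline():
                raise ValueError("file has fewer lines than line_number")
        start = f.tell()
        old = f.readline()
        if not old:
            raise ValueError("file has fewer lines than line_number")
        # Assumes "\n" line endings; the newline itself is left untouched.
        body_len = len(old) - 1 if old.endswith(b"\n") else len(old)
        if len(new_bytes) > body_len:
            raise ValueError("replacement is longer than the original line")
        f.seek(start)
        f.write(new_bytes.ljust(body_len, pad))
```

Unlike the temp-file approach, there is no intact copy to fall back on if the write is interrupted, which is the data-integrity risk raised earlier in the thread.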
        
       | xwowsersx wrote:
       | Nice example of a README done well. What, why, screenshots, how.
        
       | HellsMaddy wrote:
       | This looks cool. If I have a log file with a format that is like
       | `[timestamp] SEVERITY { json content }`, can I use the feature
       | that pretty prints the JSON part, or does the whole line need to
       | be valid JSON? If not could I somehow write a plugin that would
       | allow me to parse the lines in order to accomplish this? That
       | would be really useful.
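
For the `[timestamp] SEVERITY { json content }` shape described above, the parsing step on its own could be sketched like this. It is not a Toolong plugin (no plugin API is described in the thread); the regular expression and the `pretty_print` name are assumptions for illustration.

```python
import json
import re

# Assumed line shape: [timestamp] SEVERITY {json object}
LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+(?P<severity>\w+)\s+(?P<payload>\{.*\})\s*$"
)


def pretty_print(line):
    """Pretty-print the JSON part of a line; return other lines unchanged."""
    match = LINE.match(line)
    if match is None:
        return line
    try:
        payload = json.loads(match.group("payload"))
    except json.JSONDecodeError:
        return line  # looked like JSON but was not valid
    body = json.dumps(payload, indent=2)
    return f"[{match.group('timestamp')}] {match.group('severity')}\n{body}"
```

Lines that do not match, or whose payload is not valid JSON, pass through untouched, so the function can be applied to every line of a mixed log.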
        
         | lathiat wrote:
         | lnav can do this, may be worth a look
        
         | willm wrote:
         | Not currently, but it would be relatively easy to add.
         | 
         | You might want to open an issue on the repo.
        
       | BOOSTERHIDROGEN wrote:
       | In Indonesian "tolong" means help :)
        
         | marshblocker wrote:
         | Similar with Tagalog -- "tulong".
        
       | yalok wrote:
       | Is there a way to search/highlight multiple tokens at once?
       | 
       | Didn't find this in the help/GitHub readme...
        
       | keithnz wrote:
       | for a gui based tool, I find klogg really good.
        
       | swah wrote:
       | How to remember that I installed this in 2 weeks when I need it?
        
         | bemusedthrow75 wrote:
         | A physical post-it note on your monitor that you don't take off
         | until you've used it three times.
         | 
         | Or some other similar reminder -- perhaps just a list of things
         | you're going to install when you need them, so you don't
         | install them until you have a use.
         | 
         | (I have this problem too)
        
         | safeimp wrote:
         | When I have a similar conundrum I typically add it to my
         | calendar at a random time as a helpful popup/reminder.
         | 
         | Usually I don't even need the reminder, just writing it down is
         | reminder enough.
        
         | tester457 wrote:
         | Make this alias for whatever log viewer you use now.
         |     alias tail='echo "use toolong instead"'
         | 
         | Now using tail to view logs will print that reminder instead.
        
       ___________________________________________________________________
       (page generated 2024-02-11 23:02 UTC)