[HN Gopher] Show HN: Hck - a fast and flexible cut-like tool
___________________________________________________________________
Show HN: Hck - a fast and flexible cut-like tool
Author : totalperspectiv
Score : 101 points
Date : 2021-07-10 15:46 UTC (7 hours ago)
| queuebert wrote:
| Yay, no more piping multiple cuts when you have multiple
| delimiters.
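As a rough illustration of the pain point above: each `cut` invocation takes a single one-character delimiter, so input that mixes delimiters needs a chain of cuts. The sketch below uses made-up sample data and hypothetical function names, and it uses awk's regex field separator as a stand-in for a cut-like tool that accepts a regex delimiter. It is not hck's own interface.

```shell
# Hypothetical sample line mixing two delimiters: ':' and ','.
sample='alice:30,engineer'

# With cut, one invocation per delimiter: keep what follows ':',
# then keep what follows ','.
chained_cut() {
    printf '%s\n' "$1" | cut -d: -f2 | cut -d, -f2
}

# With a regex field separator, a single command handles both
# delimiters and the wanted value is simply field 3.
regex_split() {
    printf '%s\n' "$1" | awk -F'[:,]' '{ print $3 }'
}

chained_cut "$sample"
regex_split "$sample"
```

Both calls print the same third value; the difference is that the second form does not grow another pipeline stage for every extra delimiter.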
| lillesvin wrote:
| I wrote something similar (but never really finished it), called
| 'gut', in Go a few years back. Funny thing is, that I literally
| never use it. I thought splitting on regexes and that stuff would
| be super useful, but it turns out that I just use Perl one-liners
| instead. And Perl is available on something like 99.99% of all
| *nix machines, which my own 'cut'-substitute isn't.
|
| Still a good exercise for me to write it, and I assume for OP
| too.
| mongol wrote:
| A book "Minimal Perl" used to be referred to often in these
| discussions but I never hear about it any more. It taught
| these kinds of tricks for command-line magic.
| c54 wrote:
| I've never used Perl, but I love concise Bash one-liner wizard
| incantations. What are some examples of things it's handy for?
| atsaloli wrote:
| See https://catonmat.net/perl-one-liners-explained-part-one
|
| And https://nostarch.com/perloneliners
| totalperspectiv wrote:
| It was indeed a great exercise! Part of the motivation for me
| was also performance. I should add some Perl
| one-liners to the benchmarks to see where they land as well. My
| experience is that they are usually a bit slower than awk.
| FractalHQ wrote:
| What tool would you recommend to someone who is starting out
| and wants to learn to write nifty scripts in this day and age? I'm
| currently studying bash but there are so many scripting
| languages that I hear about and it's hard to know what to
| invest time into.
| andrewzah wrote:
| For a lot of tasks, POSIX-compliant Bash scripts are more
| than adequate. Use Perl, Python, or Ruby (your choice) if it
| becomes more complex (especially with state). It's worth
| considering ones that are installed by default on most Linux
| distros.
|
| There's no reason to chase X script/lang of the month. Bash
| etc are extremely well documented and there's a very good
| chance someone already asked how to do something similar to
| what you're doing on stackoverflow, etc.
| fragmede wrote:
| Invest time into what you need to get your job done. Easy
| when summarized like that, but let's dig in.
|
| First consider what systems you want your skills to be
| applicable for.
|
| Do you need tools that work on many random Linux machines
| that you have little control over? Then go with the lowest
| common denominator - bash, and various command line tools
| (sed,awk,grep) included with every system, and get good with
| the subset of command line options common on all of them -
| most likely limited by the oldest system you need to work
| with. (There are still Windows XP and Redhat 4 systems out in
| the wild, if you're unlucky enough to have to work with
| them.)
|
| Do you need to work with OS X at all? I never learned to use
| Apple's outdated versions of programs, instead I heavily
| customized my laptop to have compatible versions of things
| but this only works because there's 1 os x machine I ever
| deal with.
|
| Then it's about the right tool for the right job. Do you want
| to process text? Awk will take you a _long_ way, but
| ultimately, Perl is your friend. Do you want more
| structured programming type things (aka objects/classes)?
| Then Python is your friend. There's a certain mindset that
| thinks that if everything is in one language things are
| better, but that's a trap. With enough work, you can do the
| same thing in any language, but each language is better than
| others at some specific thing. (Working legacy code is
| something that a language can be better at than others.)
|
| These days, it's more important to learn what tools are
| available and how to use them. Because you can just
| google 'awk print second to last column', plug that into
| your script, and continue working, there's less of a need to
| truly grok awk's language (for example). (I mean, spend the
| time to learn it once so it will come back to you the next
| time you need to do something more custom with it.)
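For reference, the search mentioned above tends to lead to something like the following. This is a minimal sketch assuming whitespace-separated input; the function name `second_to_last` and the sample data are made up for illustration.

```shell
# Print the second-to-last whitespace-separated field of each line.
# NF is awk's count of fields on the current line, so $(NF-1) is the
# field just before the last one. Lines with fewer than two fields
# are skipped, because $(NF-1) would then mean $0 or a negative index.
second_to_last() {
    awk 'NF >= 2 { print $(NF-1) }'
}

printf 'a b c d\none two\nsolo\n' | second_to_last
```

The `NF >= 2` guard is the kind of detail a pasted one-liner often lacks, which is one argument for learning the language at least once.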
| JulianWasTaken wrote:
| > instead I heavily customized my laptop to have compatible
| versions of things but this only works because there's 1 os
| x machine I ever deal with.
|
| This is all good advice, but to be fair, "heavily
| customized" these days is nearly:
|
|     brew install awk coreutils findutils gnu-tar gnu-sed gnu-which gnu-time
| toastal wrote:
| Heck
| valbaca wrote:
| > hck is a shortening of hack, a rougher form of cut.
| bilalhusain wrote:
| It is interesting to note how it compares to "choose" (also in
| Rust) in the benchmarks.
|
| single character
|   hck            1.494 +- 0.026s
|   hck (no-mmap)  1.735 +- 0.004s
|   choose         4.597 +- 0.016s
|
| multi character
|   hck            2.127 +- 0.004s
|   hck (no-mmap)  2.467 +- 0.012s
|   choose         3.266 +- 0.011s
|
| The single pass optimization trick[1] seems to be helping a lot
| in the single character case.
|
| Of course, doing away with a pass is supposed to give 2x, and I
| am wondering whether the regex constraint led to this
| "side-effect".
|
| [1] fast mode -
| https://github.com/sstadick/hck/blob/master/src/lib/core.rs#...
| https://github.com/sstadick/hck/blob/master/src/lib/core.rs#...
| visarga wrote:
| <offtopic> I have implemented a `_split` command to split a line
| by a separator and `_stat` command that does basically `sort |
| uniq -c | sort -nr` counting elements and sorting by frequency.
| Really useful operations for me.
|
| When my one liners become 2-3 lines long I need to switch to a
| regular script, but I also log all my shell commands years back
| and have something a bit better than `history | grep word` to
| search it.</>
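The two helpers described above might look roughly like this as shell functions. Only the names `_split` and `_stat` and the `sort | uniq -c | sort -nr` pipeline come from the comment; the argument handling and the use of awk for splitting are assumptions, not the commenter's actual implementation.

```shell
# _split SEP: print each SEP-separated piece of every input line on
# its own line. SEP follows awk's separator rules: a single character
# (other than a space) is taken literally, while a longer string is
# treated as a regular expression.
_split() {
    awk -v sep="$1" '{
        n = split($0, parts, sep)
        for (i = 1; i <= n; i++) print parts[i]
    }'
}

# _stat: count identical lines and list them most frequent first,
# using the pipeline quoted in the comment.
_stat() {
    sort | uniq -c | sort -nr
}

printf 'b,a,b\nc,b,a\n' | _split , | _stat
```

The sample input yields `b` three times, `a` twice, and `c` once, listed in that order; the exact padding in front of the counts depends on the local `uniq`.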
| rashil2000 wrote:
| Love seeing these modern alternatives to coreutils! Ripgrep, fd,
| hyperfine, bat, exa, bottom, gdu, wc, sd, hexyl...
|
| Yet to find a GNU 'tr' alternative though
| sieste wrote:
| > Ripgrep, fd, hyperfine, bat, exa, bottom, gdu, wc, sd,
| hexyl...
|
| Thanks for that list! Is there any place where more of these
| "modern alternatives to coreutils" are collected?
| basetensucks wrote:
| https://github.com/ibraheemdev/modern-unix is a pretty decent
| list.
| tyingq wrote:
| Here's tr in Perl:
| https://metacpan.org/dist/PerlPowerTools/source/bin/tr
| kristopolous wrote:
| What would you like it to do?
| rashil2000 wrote:
| It's not like anyone absolutely needs it, I was just
| fascinated by the recent surge in faster and more cross-
| platform utilities.
| kitd wrote:
| Nice work!
|
| I don't know whether anyone here has used Rexx. The 'parse'
| instruction in Rexx was incredibly powerful, breaking up text by
| field/position/delimiter and assigning to variables all in one
| line.
|
| I've often wondered if there was a command-line equivalent. Awk
| is great but you have to 'program' the parsing spec, rather than
| declare it.
| tyingq wrote:
| Not declarative, but Perl can do something like that.
|
| Delimiters/Regex:
|
|     $ perl -ne '($name,$pass,$uid,$gid,$therest)=split(/:/);print "$name $gid\n"' /etc/passwd
|     root 0
|     daemon 1
|     bin 2
|     ...
|
| Fixed width:
|
|     $ printf "1234XY\n5678AB" | perl -ne '($f1,$f2)=unpack("a4 a2");print "$f2 $f1\n"'
|     XY 1234
|     AB 5678
|
| I believe Rexx's parse is fancier still, but this is reasonably
| close.
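For comparison, a rough awk counterpart to the two Perl one-liners above, under the same assumptions about input layout. The function names are made up for illustration, and the sample record stands in for a line of /etc/passwd.

```shell
# Delimiter-based: print field 1 (name) and field 4 (gid) of
# colon-separated records.
name_and_gid() {
    awk -F: '{ print $1, $4 }'
}

# Fixed width: take characters 1-4 and 5-6 of each line with substr()
# and print them in swapped order.
swap_fixed_width() {
    awk '{ print substr($0, 5, 2), substr($0, 1, 4) }'
}

printf 'root:x:0:0:root:/root:/bin/sh\n' | name_and_gid
printf '1234XY\n5678AB\n' | swap_fixed_width
```

As with the Perl version, the field positions and widths are written into the program text rather than declared separately.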
| twic wrote:
| > Awk is great but you have to 'program' the parsing spec,
| rather than declare it.
|
| You could probably turn a declarative spec into an awk program
| with an awk program.
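A toy sketch of that idea, assuming an invented spec format with one "name field-number" pair per line. The function name `gen_parser` is hypothetical, and the generator does no validation, so names containing quotes or non-numeric field numbers would produce a broken program.

```shell
# gen_parser: read the spec on stdin and write an awk program that
# prints the requested fields as name=value pairs.
gen_parser() {
    awk '
        {
            # Separate print arguments with ", " after the first one.
            sep = (NR > 1) ? ", " : ""
            # Build e.g.  "name=" $1  for the spec line "name 1".
            items = items sep "\"" $1 "=\" $" $2
        }
        END { print "{ print " items " }" }
    '
}

# Generate a parser from a two-line spec, show it, then run it on a
# colon-separated sample record.
prog=$(printf 'name 1\ngid 4\n' | gen_parser)
printf '%s\n' "$prog"
printf 'root:x:0:0:root:/root:/bin/sh\n' | awk -F: "$prog"
```

The spec stays declarative (names and positions only), while the generated program is ordinary awk that can be inspected before it is run.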
___________________________________________________________________
(page generated 2021-07-10 23:00 UTC)