[HN Gopher] Cheating at a company group activity using Unix tools
       ___________________________________________________________________
        
       Cheating at a company group activity using Unix tools
        
       Author : devenvdev
       Score  : 123 points
       Date   : 2021-12-12 10:20 UTC (1 days ago)
        
 (HTM) web link (medium.com)
 (TXT) w3m dump (medium.com)
        
       | tzs wrote:
       | The 'comm' command should be in there. With no options 'comm'
       | takes two files, F1 and F2, which should be lexically sorted, and
       | produces 3 columns of output.
       | 
       | The first column consists of lines that are only in F1, the
       | second column consists of lines that are only in F2, and the third
       | column consists of lines that are common to both files.
       | 
       | The option -1 tells it to not print column 1, -2 tells it not to
       | print column 2, and -3 does the same for column 3. These can be
       | combined, so -12 would only print column 3 (the lines that are in
       | both files) and -13 would only print column 2 (the lines that are
       | in F2 but not F1).
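
The column-suppression options described above can be tried on two small sorted files. This is a minimal sketch: the file names and contents are made up for illustration, and `-23` is simply the remaining combination (lines only in F1) that follows from the same rule.

```shell
# Work in a scratch directory so nothing real is touched.
tmp=$(mktemp -d)
printf '%s\n' apple banana cherry > "$tmp/f1"   # already sorted
printf '%s\n' banana cherry date  > "$tmp/f2"   # already sorted

only_f1=$(comm -23 "$tmp/f1" "$tmp/f2")   # hide columns 2 and 3
only_f2=$(comm -13 "$tmp/f1" "$tmp/f2")   # hide columns 1 and 3
in_both=$(comm -12 "$tmp/f1" "$tmp/f2")   # hide columns 1 and 2

printf 'only in F1: %s\n' "$only_f1"
printf 'only in F2: %s\n' "$only_f2"
printf 'in both:\n%s\n' "$in_both"
rm -r "$tmp"
```

If the inputs are not sorted, piping each through `sort` first is the usual fix.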
        
       | l0b0 wrote:
       |     ls | grep '.csv$' | xargs cat | grep 'cake' | cut -d, -f2,3 > cakes.csv
       | 
       | That's quite a few antipatterns in one go. Unless you have a
       | bajillion files the `xargs` is unnecessary, the `cat` and `ls`
       | are unnecessary (and `ls` in shell scripts is a whole class of
       | antipatterns by itself). You might want to use something like
       | this instead:
       | 
       |     grep cake *.csv | cut -d, -f2,3 > cakes.csv
        
         | darrenf wrote:
         | Almost, but not quite :)
         | 
         |     $ ls | grep '.csv$' | xargs cat | grep 'cake' | cut -d, -f2,3
         |     cake
         |     cake
         |     cake
         |     $ grep cake *.csv | cut -d, -f2,3
         |     bar.csv:cake
         |     foo.csv:cake
         |     quux.csv:cake
         | 
         | `grep -h` will do the trick though
        
           | darrenf wrote:
           | I'm surprised, hours later, that no-one has pointed out my
           | error here! So I'll do it myself - upon revisiting it, I
           | realised my input files weren't CSVs with the number of
           | fields required for `cut` to do the intended thing. Had I
           | made them so, the `-h` isn't required after all. My bad :)
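
A small sketch of why both observations hold, using hypothetical three-field CSV files. With several files, grep prefixes each match with its file name; that prefix lands in field 1, so `cut -f2,3` drops it, while a line with no comma at all is passed through by cut unchanged. `-h` is not in POSIX grep, but GNU and BSD grep both accept it.

```shell
tmp=$(mktemp -d)
# Hypothetical data: every row has three comma-separated fields.
printf '%s\n' 'id1,cake,chocolate' 'id2,pie,apple' > "$tmp/foo.csv"
printf '%s\n' 'id3,cake,carrot' > "$tmp/bar.csv"

# The file-name prefix is part of field 1, so fields 2-3 come out clean.
fields=$(cd "$tmp" && grep cake *.csv | cut -d, -f2,3)

# Field 1 still carries the prefix unless -h suppresses it.
first=$(cd "$tmp" && grep cake *.csv | cut -d, -f1)
first_h=$(cd "$tmp" && grep -h cake *.csv | cut -d, -f1)

# A line without any delimiter is printed as-is by cut.
passthrough=$(printf 'bar.csv:cake\n' | cut -d, -f2,3)

printf '%s\n' "$fields" "$first" "$first_h" "$passthrough"
rm -r "$tmp"
```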
        
         | devenvdev wrote:
         | Yes! Agree. This post is more educational than practical
         | (despite the title). You need to start somewhere and know the
         | basics to understand these details and caveats to feel what
         | quality code means in this context. I chose to use the long,
         | redundant version to show that chaining a gazillion different
         | commands is ok.
        
         | jrm4 wrote:
         | I really wish people would mostly stop doing this type of
         | utterly and completely necessary "correction." It really misses
         | the point, and exemplifies nearly the exact opposite of the
         | point and value the original article is expressing.
         | 
         | This is a tool used to accomplish a thing, and Unix tools can
         | be used to accomplish things in many different ways. This is
         | like complaining about, I don't know, using a metric-labeled
         | screwdriver on an imperial-measured screw that still gets the
         | job done exactly as needed. Cut it out.
        
           | cuu508 wrote:
           | This is pointing out you can simplify the
           | drill+pliers+plunger+screwdriver contraption to just
           | screwdriver.
        
           | bonkabonka wrote:
           | The original example _MUST_ be corrected for the same reason
           | that folks who post naive code snippets adding SQL strings
           | together with user input must be corrected.
           | 
           | It is not a matter of taste and it is not a matter of metric
           | versus imperial screwdrivers. Someone will copy this code and
           | it will end up being an attack vector where it will have
           | consequences.
           | 
           | I imagine you're rolling your eyes and have flipped the bozo
           | bit but please bear with me.
           | 
           | Think of the teachable moment this presents! The author of
           | the original piece goes back and annotates their original
           | answer along the lines of, "you might solve it this way but
           | there are some gotchas with it - let me show you what could
           | go wrong."
           | 
           | As an industry we absolutely need to circle back with
           | improvements so that those who come after us can build on a
           | more solid foundation.
        
             | vorador wrote:
             | Genuinely curious, what's so bad about the original code
             | that it adds an attack vector?
        
               | bonkabonka wrote:
               | To guard against malicious filenames I use `find ...
               | -print0 | xargs -r0...` since posix disallows null bytes
               | (and forward slashes) in filenames. The `-r` flag on
               | xargs means it doesn't execute its command if find
               | matches nothing.
               | 
               | So filenames can contain valid commands delimited with
               | semi-colons that, if not quoted properly, can be
               | unexpectedly run alongside your intended pipeline (say if
               | you're doing the usual and unsafe "for csv in *.csv; do
               | cat $csv; done").
               | 
               | I wish I could've laid my hands on the excellent HN
               | thread from some years back that opened my eyes to this
               | vector, but I'm hopeful someone else will mention it so I
               | can add it to my bash notes file. :P
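
A minimal sketch of the quoting problem, using a hypothetical file name that contains a space. What it demonstrates is word splitting: the unquoted name is broken into two arguments and the row is silently lost. Whether a hostile name can do more than that depends on how the names are used downstream (for example, pasted into an `eval` or `sh -c` string). `-print0`, `-0` and `-r` are GNU/BSD extensions rather than POSIX.

```shell
tmp=$(mktemp -d)
printf 'a,cake,1\n' > "$tmp/my file.csv"   # hypothetical name containing a space
printf 'b,cake,2\n' > "$tmp/plain.csv"

# Unquoted $csv: "my file.csv" is split into two words, so cat looks for
# ".../my" and "file.csv", finds neither, and that row goes missing.
unsafe=$(for csv in "$tmp"/*.csv; do cat $csv 2>/dev/null || true; done)

# Quoted "$csv": each name stays a single argument.
quoted=$(for csv in "$tmp"/*.csv; do cat "$csv"; done)

# find hands NUL-separated names to xargs; -r skips cat when nothing matches.
nul=$(find "$tmp" -name '*.csv' -print0 | xargs -0 -r cat | sort)

printf 'unquoted:\n%s\nquoted:\n%s\nfind/xargs:\n%s\n' "$unsafe" "$quoted" "$nul"
rm -r "$tmp"
```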
        
           | nicce wrote:
           | It might not miss the point if the sample command is more
           | complicated than it needs to be, almost like an artificially
           | long command written to demonstrate unix-magic and draw a
           | bigger audience.
           | 
           | Anti-patterns are bad because they usually mean that a sample
           | command might work in this case, but not in other
           | environments or other use-cases. Someone who is seeing these
           | commands for the first time has no idea about that. And this
           | post is meant for beginners.
        
             | jrm4 wrote:
             | I don't think it's _more complicated._ Most of the time I
             | see people doing "corrections" on the so-called proper way
             | to use Unix tools, they're favoring conciseness (sure,
             | whatever) over _intuitiveness._
             | 
             | I uselessly use "cat" ALL THE TIME, because the Unix pipe
             | metaphor is absolutely intuitively _excellent._ I know that
             | I'm starting at the beginning with pipe and shoving data
             | through, and I have a good idea about each step.
        
           | DarylZero wrote:
           | > using a metric-labeled screwdriver on an imperial-measured
           | screw that still gets the job done exactly as needed
           | 
           | This is super ignorant. You risk stripping the screw and
           | getting yourself into a frustrating screw extraction job.
           | 
           | Just because you lived, doesn't mean it was safe.
        
             | howdydoo wrote:
             | Slightly off topic, but is this something people worry
             | about in the real world? I have about 9 random
             | screwdrivers, and when I need one I just try them all until
             | one seems to fit. And if it doesn't, I... stop, instead of
             | stripping the screw. I couldn't tell you if a single one of
             | them was metric or imperial
        
               | DarylZero wrote:
               | There is no such thing as metric or imperial screw heads.
               | There are various ISO/ANSI standards.
        
               | jrm4 wrote:
               | :) That's the other thing about my example, it was just a
               | doofy little thing I made up.
        
             | jrm4 wrote:
             | Sure, let's beat up this metaphor.
             | 
             | I would say the ignorant one of us is the one who isn't
             | taking into account the nature of the job being done? What
             | makes you think that a) my task is life or death and also
             | b) I'm not equipped to figure out how dangerous it is?
             | 
             | To pull out the camera. I'd like to spread Unix tools to
             | everyone, or at least "more people than use them now." I
             | understand though, for a lot of people, they'd rather
             | gatekeep.
        
               | DarylZero wrote:
               | It doesn't matter whether it's life or death. If you end
               | up having to extract that screw it's going to be a big
               | headache.
               | 
               | You just shouldn't use an "almost"-fitting screwdriver or
               | wrench.
               | 
               | I'm not trying to "gatekeep." I'm not trying to talk
               | about the shell metaphorically. I'm literally talking
               | about screws.
        
               | carapace wrote:
               | You picked a bad metaphor, eh?
               | 
               | > The primary cause of this discrepancy was that one
               | piece of ground software supplied by Lockheed Martin
               | produced results in a United States customary unit,
               | contrary to its Software Interface Specification (SIS),
               | while a second system, supplied by NASA, expected those
               | results to be in SI units, in accordance with the SIS.
               | 
               | https://en.wikipedia.org/wiki/Mars_Climate_Orbiter#Cause_
               | of_...
               | 
               | It's hardly gatekeeping to try to teach people better
               | ways to do things.
        
               | jrm4 wrote:
               | No, it's a good one. What y'all mistakenly assumed was
               | the idea of "every computer task is a life or death one"
               | instead of considering the idea that there can be
               | different levels of care for different levels of danger.
               | 
               | The reason many people in IT can't think outside of this
               | idea is that they are (subconsciously) reluctant to open
               | the gates for others. As a teacher in IT I quite
               | literally deal with this every day.
        
               | carapace wrote:
               | I really don't think that teaching people better ways to
               | do things is exclusionary.
               | 
               | The OP even included the reasons _why_ their version is
               | preferable.
               | 
               | You really see that as complaining and gatekeeping? An
               | 'utterly and completely necessary "correction."'?
        
         | [deleted]
        
         | gbrown_ wrote:
         | I certainly agree with your points, though the original task is
         | a textbook example of where awk shines.
         | 
         |     awk 'BEGIN {FS=","} /cake/ {print $2, $3}' *.csv > cakes.csv
         | 
         | Unsurprisingly I disagree with the post's description of awk
         | being an "advanced command".
        
           | 1_player wrote:
           | awk is one of those underrated tools I always wanted to
           | learn.
           | 
           | I still remember when I started working as a sysadmin at 19,
           | the greybeard UNIX guy taught me how to vim, and he told me
           | awk was as important as knowing vim, pointing to some huge
           | AWK manual he had on the shelf, one of those with the animal
           | on the cover.
           | 
           | This was 15 years ago, I know vim, but awk still eludes me.
        
             | barrkel wrote:
             | I wrote one script in awk and learned a lot:
             | 
             | - I understood where a lot of the stuff in Perl (and, to a
             | lesser degree, Ruby) came from, especially Perl's implicit
             | loop mode.
             | 
             | - I never wanted to write a program in awk again. Ruby (or
             | whatever your preferred scripting language is) is better
             | every time: awk is not much less work and Ruby generalizes
             | more.
             | 
             | Awk for scripting is a bit like shell for scripting. It's
             | easy to extend a script into something which is
             | uncomfortably expensive to rewrite when you inevitably want
             | to do something it's not well suited to. Awk still has
             | uses, but it's in things like portable shell scripts which
             | need more text manipulation power.
        
             | yesenadam wrote:
             | Read _The AWK Programming Language_ , a joy to read, one of
             | the finest docs ever written, I reckon. You'll be up and
             | running in minutes.
             | 
             | https://archive.org/details/pdfy-MgN0H1joIoDVoIC7
        
             | asicsp wrote:
             | I wrote a book on GNU awk one-liners with hundreds of
             | examples and exercises [0]. Free to read online.
             | 
             | For a quick introduction, see [1] [2]
             | 
             | [0] https://github.com/learnbyexample/learn_gnuawk
             | 
             | [1] https://backreference.org/2010/02/10/idiomatic-awk/
             | 
             | [2] https://earthly.dev/blog/awk-examples/
        
             | smcameron wrote:
             | If you know C and regex, that's 85% to 95% of awk. The
             | basic layout of an awk program is a sequence of
             | regex/action pairs, where the regexes are between slashes,
             | and the actions are between curly braces. (There's also the
             | special regex values of BEGIN and END, executed prior to
             | the first line of input and after all input is consumed,
             | respectively). The action part has a syntax very much like
             | C, with the special values $n (where n is an integer)
             | representing field number n for any input line. There are
             | a bunch of special functions it defines that you can call,
             | like substr(), or split(), and good old printf. Awk
             | processes the input stream one line at a time, matching
             | each line against the series of regexes, and if it matches
             | a regex, it executes the action paired with that regex,
             | with all the $n variables filled in with field values from
             | that input line. And that's basically enough for 99% of
             | what awk is used for on a day to day basis.
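
The pattern/action layout described above can be seen in a three-rule program. This is a sketch over made-up order data; the file name, the field meanings and the `total` variable are assumptions for illustration only.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'alice,cake,3' 'bob,pie,1' 'carol,cake,5' > "$tmp/orders.csv"

report=$(awk -F, '
    BEGIN  { print "name count" }           # runs once, before any input is read
    /cake/ { print $1, $3; total += $3 }    # runs for every line matching the regex
    END    { printf "total %d\n", total }   # runs once, after the last line
' "$tmp/orders.csv")

printf '%s\n' "$report"
rm -r "$tmp"
```

Each input line is split on the separator given with `-F`, so `$1` and `$3` are the first and third comma-separated fields of whichever line is currently being matched.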
        
           | yesenadam wrote:
           | awk -F, '/cake/ {print $2, $3}' *.csv > cakes.csv
           | 
           | does the same thing: -F sets the field separator.
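
One detail worth checking when swapping `cut` for awk here: `-F` only sets the input separator. With `print $2, $3` the fields are joined by awk's output separator, which defaults to a single space, so the result is no longer comma-separated. A small sketch on one made-up row, with `-v OFS=,` as one way to keep the commas:

```shell
line='7,cake,chocolate'   # hypothetical input row

spaced=$(printf '%s\n' "$line" | awk -F, '/cake/ {print $2, $3}')
commas=$(printf '%s\n' "$line" | awk -F, -v OFS=, '/cake/ {print $2, $3}')
cutout=$(printf '%s\n' "$line" | cut -d, -f2,3)

printf 'awk -F,          : %s\n' "$spaced"
printf 'awk -F, -v OFS=, : %s\n' "$commas"
printf 'cut -d, -f2,3    : %s\n' "$cutout"
```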
        
           | gorgoiler wrote:
           | My thoughts exactly, though with "-F," to quickly set the
           | field separator.
        
           | jasode wrote:
           | _> I disagree with the post's description of awk being an
           | "advanced command"_
           | 
           | I guess there's no categorization of "advanced" vs "beginner"
           | that will satisfy every audience but I consider awk an
           | advanced tool. About 20 years ago, I wrote some AWK tips and
           | cheatsheet back on USENET and today I would have to refer to
           | that post to write basic awk commands.
           | 
           | The thing about awk is that it's a compact _programming
           | language_ with variables and conditionals, and that's a
           | step-change in complexity for many users.
        
           | nickjj wrote:
           | > Unsurprisingly I disagree with the post's description of
           | awk being an "advanced command".
           | 
           | I think it can be pretty advanced, for me awk is one of those
           | tools where I still feel like I need to write a paragraph of
           | comments to explain 1 line of code.
           | 
           | For example: https://github.com/nickjj/invoice/blob/75660dce5
           | a29ceb4e47a6...
           | 
           | Keep in mind I don't really "know" awk. I cobbled that
           | together from a few examples. It will convert times formatted
           | like "2h 30m", "150m" or "2:30" into 2.50. There's a bunch of
           | examples in the test file.
           | 
           |  _NOTE: I wrote that script 2.5 years ago and I know there are
           | questionable patterns in other areas of the script that aren't
           | highlighted, like using a bunch of separate echo calls instead
           | of a heredoc._
           | 
           | Shell scripting is really fun and efficient. I use it all the
           | time for a variety of things.
        
             | nicce wrote:
             | awk is Turing-complete, so if that is not enough to
             | describe "advanced" tool, then I don't know what it is.
        
               | cross wrote:
               | So is sed.
        
               | vlovich123 wrote:
               | I'd classify basic search/replace functionality of sed as
               | "basic" - you see such scripts frequently enough in shell
               | scripts. Same for awk and "extract fields 1 and 3".
               | Anything beyond that quickly escalates into intermediate
               | and advanced usages. Often you can find an SO post for
               | lots of intermediate usages (where you can just include
               | it as a link saying "I'm trying to do X") but advanced
               | usages that require knowledge of that are best avoided
               | unless you are investing in sed/awk expertise in your
               | team. Usually choosing something like Python, Ruby, Node
               | or even Perl can provide better value because all of
               | those languages can solve the same problem and solve
               | other problems those tools can't.
        
           | devnull255 wrote:
           | Awk is advanced in the sense that it is a programming
           | language by itself. Years ago I had to migrate an old version
           | of an Informix database to a newer version. The old version
           | did not have tools to export the database as DDL and DML
           | statements. So I had to create a tool myself to do it. The
           | system did not have perl installed so I had to use awk. It
           | worked nicely enough.
        
         | XorNot wrote:
         | I'd go further and say don't parse CSV with plaintext tools
         | because it's barely a plaintext format. Use a CSV library and
         | save yourself heartache when someone drops a quoted string in
         | somewhere.
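
A short sketch of the quoted-string failure, on a made-up row whose second field contains a comma inside quotes. `cut` knows nothing about CSV quoting, so it splits in the middle of the field.

```shell
row='1,"cake, chocolate",12.50'   # hypothetical row; field 2 is quoted

second=$(printf '%s\n' "$row" | cut -d, -f2)
third=$(printf '%s\n' "$row" | cut -d, -f3)

printf 'field 2 according to cut: %s\n' "$second"
printf 'field 3 according to cut: %s\n' "$third"
```

A CSV-aware reader would return `cake, chocolate` as field 2 and `12.50` as field 3; here the price has been pushed out to field 4.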
        
           | oblio wrote:
           | I don't understand why you're being downvoted.
           | 
           | Parsing CSV with simple text-oriented tools is as bad an idea
           | as parsing HTML with regexps.
        
             | devenvdev wrote:
             | Most likely because it contradicts most people's
             | experience, some CSVs can't be parsed with cli tools, but
             | most of them can be, and it's much easier than writing code
             | that does the same. So what the parent commenter says is
             | true, just not pragmatic.
        
               | oblio wrote:
               | It is pragmatic with a minor tweak.
               | 
               | https://csvkit.readthedocs.io/en/latest/
               | 
               | https://github.com/BurntSushi/xsv
               | 
               | Heck, even sqlite has some stuff to help with processing
               | CSV files: https://www.sqlite.org/csv.html
        
           | ryanianian wrote:
           | This isn't parsing CSV, it's generating it. That's not nearly
           | as fraught.
        
         | mdoms wrote:
         | It will also fall flat on its face for CSV files that contain
         | values which contain escaped commas.
        
       | mattrighetti wrote:
       | For those interested in this topic I would suggest these
       | incredible lectures by MIT [0], especially the data wrangling
       | one.
       | 
       | Lectures are hosted on YouTube, they are extremely valuable and
       | easy to follow and they give a pretty good insight on a lot of
       | Unix topics.
       | 
       | [0]: https://missing.csail.mit.edu/2020/
        
       | unixbane wrote:
       | > Was it worth it?
       | 
       | > 1 minute to do this
       | 
       | > 1 minute to do that
       | 
       | and 1 minute to introduce RCE vulns into company #589179283672's
       | pipeline due to the "you don't understand the security
       | implications of using fragile UN*X tools" problem which applies
       | to anyone actually learning something from this article DAY OF
       | THE SEAL SOON,
        
       | dsr_ wrote:
       | "After some digging, it was easy to find the HTTP request that
       | pulled this information from the server. And it even had all the
       | birthdates in the JSON!"
       | 
       | HR needs to know this, but it shouldn't be available to random
       | employees.
        
         | devenvdev wrote:
         | Why though? We use hibob, and anyone can find anyone's full
         | name and birthday via UI anyway. Are there any compliance
         | issues with this?
        
           | toomuchtodo wrote:
           | It's PII and should be restricted to only those who require
           | the data for their job (HR).
        
           | dsr_ wrote:
           | Do you live in a place where there are anti-discrimination
           | laws concerning employee ages? If not, perhaps there is no
           | issue for you.
        
             | actually_a_dog wrote:
             | Anybody who can effectively discriminate against an
             | employee based on age probably has access to that info
             | directly via some HR system anyway.
        
       | amtamt wrote:
       | Extension of classic problem from "Programming Pearls" by John
       | Bentley. Nice to see such pragmatism for one time problems.
        
         | carapace wrote:
         | s/John/Jon/
        
       | smitty1e wrote:
       | For doing work with JSON data, I'd add:
       | 
       | https://stedolan.github.io/jq/
        
         | b6z wrote:
         | I don't understand what you mean. Half of the article is using
         | jq.
        
       | ccalloway wrote:
       | Most of the justifications for using collections of command-line
       | Unix tools are no longer valid today. Instead you should be using
       | a proper programming language.
       | 
       | Note that people who still do use complex solutions built from
       | cat, head, cut, etc, and who know what they're doing, will
       | typically either write a shell script (which won't be structured
       | particularly differently from the equivalent Python or whatever)
       | or will rely heavily on awk (itself a full-featured programming
       | language, no easier to learn than any other scripting language),
       | or both.
       | 
       | One-liners which pipe text between four or five different
       | commands are the equivalent of hand-soldered boards or bitwise
       | arithmetic. Interesting to learn about for historical reasons but
       | of no practical utility.
       | 
       | The use of things like xargs and jq in this solution, difficult
       | to invoke Unix utilities for doing things that are trivial in any
       | reasonable language, makes this even more clear.
        
         | high_5 wrote:
         | > Interesting to learn about for historical reasons but of no
         | practical utility.
         | 
         | The practical utility has just been demonstrated in this
         | particular article? Historical? I think that Unix shells are
         | like crocodiles - outliving the dinosaurs and lurking in the
         | murky water for the unsuspecting sysadmin to come close enough
         | to fix the script that ain't broken.
        
           | ccalloway wrote:
           | > The practical utility has just been demonstrated in this
           | particular article?
           | 
           | How? The author is doing something which could be done much
           | more easily and elegantly in a programming language.
        
           | oblio wrote:
           | Your analogy is quite interesting, in the sense that
           | crocodiles are around, but they're not the dominant species.
           | That would be an interesting continuation to your analogy,
           | actually.
        
             | high_5 wrote:
             | Sooner or later one has to take a sip from the unixy pond
             | :-)
        
               | oblio wrote:
               | For now. At some point we're going to build proper water
               | management and sanitation systems. What those are in the
               | software world, I, as a swamp dweller, cannot say.
        
         | spekcular wrote:
         | I don't understand why you're being so harshly downvoted. This
         | seems ... plausibly correct to me?
         | 
         | Some questions:
         | 
         | 1) Where do you draw the line, and move away from shell to a
         | "real" language. If I just want to view a directory,
         | surely I use ls, right? What about if I want to remove all
         | leading whitespace from a text file? This can be done with a
         | short but fairly opaque awk one-liner. Probably takes way more
         | time to write and run the Python equivalent, but I don't know
         | Python so well, so maybe it's also a one-liner.
         | 
         | 2) What's the optimal "real language" for replacing shell?
         | Python? Perl? Raku?
         | 
         | 2a) xargs automatically parallelizes programs. How can this be
         | done efficiently (meaning I don't have to write much additional
         | code) in your proposed "real language"?
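
For the leading-whitespace case in question 1, here is a sketch of two shell one-liners, one with sed and one with awk, run on made-up sample text. Neither is claimed to be the one-liner the commenter had in mind.

```shell
# Sample text with leading spaces, a leading tab, and an untouched line.
sample() { printf '   indented\n\tTabbed line\nflush\n'; }

# sed: delete any run of whitespace at the start of each line.
with_sed=$(sample | sed 's/^[[:space:]]*//')

# awk: same idea; sub() edits the current line in place, then print emits it.
with_awk=$(sample | awk '{ sub(/^[ \t]+/, ""); print }')

printf '%s\n' "$with_sed"
```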
        
           | actually_a_dog wrote:
           | We had a rule at my old workplace that any shell script
           | that's over a page of code was to be replaced with a Python
           | script. We chose Python because guaranteed there would always
           | be an up to date version of Python on any cloud machine we
           | used, but you could certainly make a case for some other
           | language. In a vacuum, my choice might be Scheme.
        
           | dagw wrote:
           | _xargs automatically parallelizes programs. How can this be
           | done efficiently (meaning I don't have to write much
           | additional code) in your proposed "real language"?_
           | 
           | By using xargs :)
           | 
           | Or more likely gnu parallel. But seriously, in so many cases
           | using GNU parallel to parallelize a process is the quickest
           | and easiest way to approach the problem, and I use it all the
           | time. If I need to process 10k+ images in a folder, rather
           | than try to parallelize the process in my python/C script
           | I'll write the fastest possible single threaded script that
           | takes its input args as command line arguments and then uses
           | gnu parallel to distribute the workload. The added advantage
           | is that I can distribute this work on a cluster of machines
           | with only a few changes to GNU Parallel's command line
           | arguments.
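
A minimal sketch of the "single-threaded worker plus a parallel driver" pattern with plain xargs. Note that xargs runs one invocation at a time unless told otherwise; the `-P` option (a GNU/BSD extension, not POSIX) sets how many run concurrently. The squaring command is a stand-in for a real per-item script.

```shell
# -n 1 passes one argument per invocation; -P 4 runs up to four at once.
# Completion order is not guaranteed, so the results are sorted afterwards.
squares=$(printf '%s\n' 1 2 3 4 5 6 |
    xargs -n 1 -P 4 sh -c 'echo $(( $1 * $1 ))' _ |
    sort -n)

printf '%s\n' "$squares"
```

The `_` after the script fills `$0`, so the item supplied by xargs arrives as `$1`.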
        
         | smitty1e wrote:
         | > command-line Unix tools are no longer valid today
         | 
         | Strongly disagree. The understanding of the OS, the data, and
         | how to checkmate the problem with minimal effort is timeless.
         | 
         | Programming languages are relatively ephemeral compared to
         | POSIX utilities.
         | 
         | Invest in knowledge of the enduring.
        
         | akho wrote:
         | > Instead you should be using a proper programming language.
         | 
         | Why use many word when few word do trick?
         | 
         | Shell scripts are very compact for what they do, and generally
         | as clear as a similarly quick-and-dirty solution in another
         | language. Ease of learning and use comes from having used the
         | programs you are running in your script. E. g. my small script
         | piping stuff from nmcli to fzf so I can choose a wifi network
         | would be much more difficult to write in Python: I'd need to
         | find a Python library for interacting with NetworkManager, and
         | a library for interactively fuzzy searching in a list, read the
         | docs, spend a while setting up a venv to run it, ... I don't
         | have time for any of that.
         | 
         | xargs, in particular, is not difficult to invoke once you've
         | done it once or twice, and does a lot. Apart from just being a
         | loop, it also parallelizes execution, and can ask for
         | confirmation from the user for each invocation. Implementing
         | either of these features will take you more than 2-4 chars you
         | need to use it with xargs.
        
         | aulin wrote:
         | wait, what's wrong with hand soldering and bitwise arithmetic?
         | hand soldering is still of very much practical utility for
         | building prototypes, small production batches, electronics
         | repair... Bitwise arithmetic... seriously? tell that to an
         | embedded developer or anyone who works with low-level stuff.
        
           | ccalloway wrote:
           | Nothing is wrong with them if 1) you have a specifically
           | suited, niche problem and you understand the complexities and
           | tradeoffs of the tool OR 2) it's not a critical requirement
           | to solve the problem using modern, efficient tools AND you
           | want to use something else for fun or learning or whatever.
           | 
           | If this isn't the case, stay away.
        
         | Kinrany wrote:
         | No "proper programming language" is capable of ergonomically
         | piping between programs.
         | 
         | Shell is indeed very old and it's time for a replacement, but
         | it's not there yet.
         | 
         | Oilshell might get there eventually or at least spark interest
         | in this area.
        
           | ilyash wrote:
           | > No "proper programming language" is capable of
           | ergonomically piping between programs.
           | 
           | Solved. https://github.com/ngs-lang/ngs
           | 
           | I'm the author. Frustrated with exactly this situation I
           | created Next Generation Shell. It's a "proper programming
           | language" on one hand but domain-specific for "DevOps"y
           | scripting on another. So sane syntax, data structures, error
           | handling, multiple dispatch on one hand but also syntax for
           | running external programs, pipes and redirects.
           | 
           | You are welcome!
        
           | ccalloway wrote:
           | > No "proper programming language" is capable of
           | ergonomically piping between programs.
           | 
           | You cherry-picked the one thing that shell script is somewhat
           | better at than other languages (not even really needed for
           | this task). Meanwhile the article uses shell for both making
           | HTTP requests and mangling JSON data, both of which are easy
           | in all modern languages, and extremely painful in shell.
        
             | actually_a_dog wrote:
             | It's not cherry picking to mention the thing that makes a
             | fundamental component of the UNIX philosophy work.
             | 
             | > (ii) Expect the output of every program to become the
             | input to another, as yet unknown, program. Don't clutter
             | output with extraneous information. Avoid stringently
             | columnar or binary input formats. Don't insist on
             | interactive input.
             | 
             | https://homepage.cs.uri.edu/~thenry/resources/unix_art/ch01
             | s...
        
         | flohofwoe wrote:
         | Fundamentally it's the same thing. The UNIX tools are the
         | "batteries included" standard library of the integrated shell
         | programming environment. And if you need to extend that library
         | you quickly whip up a new minimal command line tool in C (or
         | any other language which lets you write small, quick and
         | dirty command line tools, like Python).
         | 
         | The only downside of shell scripting is that it isn't trivially
         | portable to Windows (or even macOS because of the differences
         | between GNU and BSD tools), so it often makes sense to create
         | big Python scripts that do more than "one thing right". If the
         | whole world ran on UNIX, shell scripting would make much
         | more sense.
        
           | actually_a_dog wrote:
           | > The only downside of shell scripting is....
           | 
           | The _only_ downside? Let's add that no major *NIX shell that
           | I'm aware of has any good way to modularize code while
           | enforcing encapsulation of state.
           | 
           | At my previous job, we had a rule that any shell script
           | longer than about a page of code had to be replaced with a
           | Python script ASAP. That was a good rule, IMO, because once
           | you've exceeded a certain size, a shell script starts getting
           | brittle and hard to work with. I don't know if 1 page of code
           | is the threshold size or not, but it seems like as good a
           | cutoff as any.
        
         | hawski wrote:
         | Shell is a very high level language that is undoubtedly bizarre
         | at times. Its routines can be written in any programming
         | language you want without much of a boilerplate or any special
         | bindings, because support for command line arguments,
         | environment variables, reading and writing files is a bare
         | minimum that most languages support. It is a tradeoff - it has
         | its immense advantages and a couple of often hard to navigate
         | disadvantages. But the ability to easily compose whatever you
         | want is something you can't ignore.
         | 
         | I think the perfect very high level language is closer to shell
         | than to python for example. The power of Tcl/Tk (still it has
         | some big weaknesses) or Rebol/Red is something that I admire.
         | 
         | The following statement is probably controversial: shell is
         | more akin to Lisp for human beings. I dabbled at Scheme, but it
         | is harder for me to grasp than Shell, but in the end they are
         | more similar than not.
         | 
         | I hold a candle for Oil shell for example.
        
         | herbst wrote:
         | This is only partially true. Working on your own machine this
         | may be fully the case, but debugging and maintaining random
         | servers is a completely different beast.
         | 
         | Shell is portable in a way nothing else is, same reason people
         | use Excel instead of code or PHP instead of literally anything.
        
           | ccalloway wrote:
           | Firstly the model of sshing into your server and trying to
           | run commands on it directly is more or less obsolete.
           | 
           | Secondly the presence of all these utilities on a machine is
           | far from guaranteed - expecting Python to be present is no
           | more or less likely.
        
             | spekcular wrote:
             | How is it obsolete? This is an honest question. I SSH into
             | the personal server hosting my website and tweak things all
             | the time. If there's an easier way to get things done - let
             | me know!
        
               | ccalloway wrote:
               | I mean, the example you gave of what you use ssh for is
               | itself an obsolete function. Using a personal Unix server
               | to host a personal website is not really any different
               | than making your own clothes using a sewing machine. It
               | is perfectly fine - but it's a niche activity done by
               | hobbyists who enjoy the process itself.
        
               | spekcular wrote:
               | How should I be hosting my website then? Suppose for the
               | sake of an example it's some simple custom HTML, CSS, and
               | a few javascript math apps. I don't enjoy the process, I
               | just need the damn website online.
               | 
               | (Again - an honest question! I was never formally
               | instructed in these matters.)
        
               | oblio wrote:
               | At a higher level of operation, companies are switching
               | to immutable infrastructure.
               | 
               | So you don't need to SSH to debug, you'd just re-deploy
               | another container or something.
               | 
               | However, there's an <<extremely>> long tail of places,
               | equivalent to mom and pop stores, where SSH will still be
               | in use for a long, long time. I include antiquated banks
               | or government in "mom and pop stores" for the purposes of
               | this discussion :-p
        
             | cranekam wrote:
             | > Firstly the model of sshing into your server and trying
             | to run commands on it directly is more or less obsolete.
             | 
             | Why do you get to declare what is acceptable (using UNIX
             | tools) or what's now obsolete (SSHing into a host)? It's
             | rather presumptuous to believe you can speak for everyone.
             | Perhaps you only ever deploy code using k8s and never need
             | to use a command line but that doesn't mean we all do.
             | There are many reasons one would use these tools and
             | approaches (fun, small scale, investigating problems,
             | anything).
        
               | ccalloway wrote:
               | You might think 'fun, small scale, investigating
               | problems' are good justifications for your working
               | practices, but your boss's boss's boss does not. They
               | would prefer you to have less fun. Or perhaps to have fun
               | on your own time after being laid off.
        
               | herbst wrote:
               | My boss (me) doesn't mind me dabbling around in my
               | servers via SSH. Not everyone of us works for evil
               | corporate.
        
               | ccalloway wrote:
               | That's fine but you are proving my point. I didn't say
               | that no one uses shell, just that there aren't good
               | practical reasons for doing so.
               | 
               | Of course lots of people do it because they enjoy it or
               | are used to it, and for them cost and related
               | considerations don't matter. The same way that lots of
               | people ride horses.
        
               | cranekam wrote:
               | _head in hands_
               | 
               | For what it's worth, my bosses all the way up to CEO of a
               | 50k person company were all pretty pleased with the
               | working practices I used (medieval things like strace(1),
               | perf(1) and sort(1)) to save thousands of servers' worth
               | of CPU. I'm not advocating for editing code live over SSH
               | but there are still plenty of legitimate reasons to SSH
               | into a server and use UNIX tools.
               | 
               | Since you're clearly trolling I'll leave it here.
        
             | flohofwoe wrote:
             | First, Python2, or Python3, if Python3, what incompatible
             | version exactly? And what's the executable name, python,
             | python2, python3?
             | 
             | On macOS, there's still only Python 2.7 pre-installed
             | (Xcode then goes ahead and also installs an equally
             | outdated Python 3 version). On Windows the Python
             | executable is an empty stub which opens the app store.
             | 
             | I wish the situation would be better.
        
               | oblio wrote:
               | > On Windows the Python executable is an empty stub which
               | opens the app store.
               | 
               | As opposed to trying to directly launch /bin/sh on
               | Windows? :-p
        
               | microtherion wrote:
               | I would argue, however, that the situation for shells is
               | even worse, as:
               | 
               | 1. The version of bash is even more out of date (Dating
               | back to 2007 on current macOS releases).
               | 
               | 2. #!/bin/sh Could give you any one of a number of
               | slightly different shell implementations.
               | 
               | 3. The differences between Python versions are clearly
               | documented, and, while painful, possible to work around.
               | The differences between shell versions can be far more
               | subtle and more difficult to discover.
               | 
               | 4. You not only need to deal with the shell version, but
               | the version of every tool you use as well.
        
               | devenvdev wrote:
               | I would argue that in practice, the only difference
               | between implementations that was an inconvenience to me was
               | on Mac - I have to use gsed instead of sed, etc. Everywhere
               | else, it's pretty much consistent from the one-liner user
               | POV. On the other hand, I occasionally encounter
               | different python versions, which always includes some
               | mental effort to tune in.
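One way to soften the gsed/sed split mentioned above is to pick the binary once and reuse it. This is only a sketch: the variable names `SED` and `replaced` are made up for illustration, and it assumes GNU sed is installed under the name `gsed` on systems where plain `sed` is the BSD one.

```shell
# Prefer GNU sed under the name "gsed" when it exists; otherwise use plain sed.
if command -v gsed >/dev/null 2>&1; then
  SED=gsed
else
  SED=sed
fi

# A substitution that behaves the same in GNU and BSD sed.
replaced=$(printf 'cake\n' | "$SED" 's/cake/pie/')
echo "$replaced"
```

This only hides the name difference. Options that differ between the two implementations still need care.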
        
         | 2143 wrote:
         | > Most of the justifications for using collections of command-
         | line Unix tools are no longer valid today.
         | 
         | Why is it not valid today?
         | 
         | > One-liners which pipe text between four or five different
         | commands are the equivalent of hand-soldered boards or bitwise
         | arithmetic.
         | 
         | Why is that bad? Don't deploy a supercomputer to do what a hand
         | soldered board can. Keep it simple.
         | 
         | I don't see how you equate piping commands to bitwise
         | arithmetic, but bitwise arithmetic is easy anyway.
         | 
         | > but of no practical utility
         | 
         | Says you. Just because you don't find something useful doesn't
         | mean nobody else finds it useful.
         | 
         | Turning one liner piped commands to a program in what you might
         | consider a "proper programming language" usually ends up
         | turning a declarative program into something procedural. Not
         | that that's necessarily a bad thing. Just saying.
         | 
         | Use the right tool for the job. In some cases (not all) the
         | shell is indeed the right tool.
         | 
         | I get a feeling you don't understand the Unix philosophy. Read
         | The Art of Unix Programming by Eric S. Raymond. Go learn
         | bitwise arithmetic.
         | 
         | The uneducated play with pictures. Educated people read and
         | write :)
         | 
         | Have a great day (or night, depending on your timezone -- night
         | here).
        
         | smcameron wrote:
         | This is akin to saying if you want to hang a picture in your
         | house, you shouldn't just grab a hammer and a nail, instead,
         | you should get a nice piece of wood, and a nice hunk of metal,
         | go to the workshop, fire up the forge, the mill and various
         | wood and metal working machines, and forge a special picture-
         | hanging-hammerer-thing.
        
           | ccalloway wrote:
           | Umm, no, chaining together 5 or more somewhat arcane single-
           | purpose tools is the unrealistic solution here.
           | 
           | If the article was about using ls to just list files, and I
           | had said "actually you should use Python's os.listdir() and
           | filter the results by whatever" you would be right.
           | 
           | For most simple problems it's correct to use a simple tool.
           | For the overwhelming majority of complex problems you should
           | use a well-understood, well-designed, common general-purpose
           | tool.
        
             | pessimizer wrote:
             | What's supposed to be the difference between chaining
             | simple UNIX commands and chaining simple python functions
             | again?
        
         | Lamad123 wrote:
         | I heard awk is extremely fast, much faster than Python.
        
           | hansel_der wrote:
           | i heard python is among the slowest
        
           | revscat wrote:
           | It is, but it's much closer to perl when it comes to
           | maintainability.
        
           | ccalloway wrote:
           | Yes, but execution speed (of things like creating a lookup
           | table from the second and third fields in each line) is
           | rarely the operative constraint.
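For readers unsure what "creating a lookup table from the second and third fields" means in awk, here is a minimal sketch. The sample data and the names `lookup_demo` and `table` are invented for illustration; nothing here is taken from the article.

```shell
# Hypothetical sample data: comma-separated id,name,dessert.
lookup_demo() {
  printf '%s\n' '1,alice,cake' '2,bob,pie' '3,carol,cake' |
    awk -F, '
      { table[$2] = $3 }          # key: field 2, value: field 3
      END { print table["bob"] }  # look up one key after all lines are read
    '
}

lookup_demo
```

The associative array is built in a single pass over the input, and the lookup happens in the `END` block once the table is complete.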
        
             | jes wrote:
             | What is most often the operative constraint?
        
               | ccalloway wrote:
               | Time spent maintaining your script by your future co-
               | workers.
        
         | pkrumins wrote:
         | Sir, you are absolutely and totally wrong. No sysadmin has time
         | or interest to write programs or use "proper programming
         | languages" to get the job done. Sysadmins know their tools and
         | are fast and efficient. Zero sysadmins will ever write "proper
         | programs" when they can pipe find, sed, and awk.
        
           | ccalloway wrote:
           | Yes, but the number of sysadmins in the world is converging
           | rapidly to zero. Mainly because the invisible overhead of
           | having your boxes managed by people who use sed and awk is
           | much higher than the alternatives (containerisation, cloud
           | vendors, cattle not pets, infrastructure as code).
           | 
           | I think you actually confirmed my point when you said 'no
           | sysadmin has time or interest'. I didn't say that no one uses
           | shell anymore. I said that there was no practical
           | justification for doing so. Lots of people still use it and
           | don't have any _interest_ in finding a better solution. These
           | people are choosing, for their own reasons, to double down on
           | an obsolete skillset. Good for them, I suppose, but I don't
           | think that demand for their niche is going to be around for
           | very long.
           | 
           | I also pointed out that sysadmins who know what they're doing
           | use awk. Based on your mentioning awk it seems like you agree
           | with this.
        
             | ozim wrote:
             | I care to disagree because you just move one abstraction
             | layer higher and again you will use sed and awk.
             | 
             | To move stuff to that cattle and to maintain cloud
             | environments, to prepare k8s clusters and manage those. To
             | create docker images or to maintain servers that are
             | running k8s.
             | 
             | You still have AWS console and Azure CLI or GCP console and
             | configurations that need to be search/replaced. You still
             | need to parse logs to understand what is happening in
             | multiple environments, which I would say makes it even more
             | important to be good with unix-fu because you have to find
             | that one broken cow to put it out of misery.
             | 
             | Those command line tools are still useful and used.
        
               | ccalloway wrote:
               | Did you read or understand my original comment?
               | 
               | I specifically named awk as a way of solving these types
               | of problems that has a similar applicability and maturity
               | to a scripting language. My real objection is to long
               | chains of cut, head, cat, grep, and so on. I also think
               | that utilities like jq and xargs, which were invented
               | exactly to allow people to do things in shell one-liners
               | which are trivial in a proper language, are worse than
               | useless.
               | 
               | I didn't say that no one should use awk, or no one should
               | use the command line.
               | 
               | In fact I didn't say that no one should use pipes with
               | head, cut and grep either. I just said that _most_
               | reasons to use them were no longer valid.
        
             | DarylZero wrote:
             | >containerisation, cloud vendors, cattle not pets,
             | infrastructure as code
             | 
             | Lots of shell script in that part of the code world.
        
             | turtlebits wrote:
             | You don't need to be a sysadmin to take advantage of CLI
             | tools. Anyone that needs to manage files and/or the
             | contents of them can benefit.
        
         | gattilorenz wrote:
         | > Most of the justifications for using collections of command-
         | line Unix tools are no longer valid today. Instead you should
         | be using a proper programming language.
         | 
         | That's just, like, your opinion man...
         | 
         | > people who still do use complex solutions built from cat,
         | head, cut, etc, and who know what they're doing, will typically
         | either write a shell script or use awk
         | 
         | No, I still use cat, head, cut etc., because it's easier to see
         | at each step what is happening and incrementally add to that,
         | because they're literally everywhere, because it's quick, and
         | because I like it. Not for major projects, granted, but why
         | would I need to write a python file for something that takes a
         | single line of piped commands?
        
           | pkrumins wrote:
           | Amen.
        
         | 3np wrote:
         | I guess it depends on what you do. For a subset of tasks, shell
         | scripting is a lot faster to implement than the equivalent
         | python/js/go/ruby/rust.
        
       | perryizgr8 wrote:
       | > regex is so ubiquitous and valuable that if you don't know it
       | yet, you should learn it)
       | 
       | Regex is one of those things I have to learn every single time I
       | need to use it. I just can't seem to force myself to remember.
        
       | pkrumins wrote:
       | The first example is super super bad here. Never pipe `ls`. When
       | you feel like you need to pipe `ls`, then you know you want to
       | use `find`.
        
         | devenvdev wrote:
         | Yes, as I answered in another thread - this post is much more
         | educational than practical. It's intended to teach how to use
         | pipes and simple commands together. Explaining why piping `ls`
         | is bad and what is the difference between `ls` and `find`
         | commands would miss the point of the post and would be
         | confusing :)
        
           | MisterTea wrote:
           | > this post is much more educational than practical.
           | 
           | I understand that you don't want to be mean but this post is
           | neither. It's like a bad gun safety video where the alleged
           | instructor points a loaded gun at school children and then
           | looks down the barrel while polishing the trigger...
        
         | DoingIsLearning wrote:
         | Never?
         | 
         | ls |less
        
         | tzs wrote:
         | There are a lot of times one only wants non-dotfiles in the
         | current directory.
         | 
         | The find would be something like
         | 
         |     find . -not -path '*/\.*' -type f -depth 1
         | 
         | What advantages does that have over 'ls' for that case?
        
           | jasode wrote:
           | _> What advantages does that have over 'ls' for that case?_
           | 
           | The gp was talking about _issues with piping from "ls |"_ and
           | your particular case of "find" being more convoluted than
           | "ls" isn't comparing that.
           | 
           | Example of the topic that gp was warning about:
           | 
           | http://mywiki.wooledge.org/ParsingLs
           | 
           | https://unix.stackexchange.com/questions/128985/why-not-
           | pars...
           | 
           | [Also fyi... you may have meant "-maxdepth" instead of
           | "-depth" in your example.]
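With the "-maxdepth" correction applied, the non-dotfile listing could look roughly like the sketch below. The scratch directory and the names `demo_dir` and `non_dot_files` are hypothetical, and `-not -name '.*'` is swapped in for the `-path` test because only one directory level is searched. Both `-maxdepth` and `-not` are extensions found in GNU and BSD find rather than POSIX.

```shell
# Scratch directory with one regular file, one dotfile, and one nested file.
demo_dir=$(mktemp -d)
touch "$demo_dir/visible.txt" "$demo_dir/.hidden"
mkdir "$demo_dir/sub"
touch "$demo_dir/sub/nested.txt"

# Regular, non-dot files directly inside the directory, without descending.
non_dot_files=$(cd "$demo_dir" && find . -maxdepth 1 -type f -not -name '.*')
echo "$non_dot_files"
```

Unlike plain `ls`, this form also filters by file type, so the `sub` directory is left out.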
        
           | revscat wrote:
           | FYI the zsh glob for this is `^.*`, e.g.:
           | 
           |     echo ^.*
           | 
           | will show all non-dotfiles in the current directory.
        
         | oweiler wrote:
         | To be even more pedantic, you probably want to use a glob
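A glob-based loop, as suggested above, might look like this sketch. The directory and file names are invented for illustration; the point is that each match arrives as one word, even when it contains a space.

```shell
# Scratch directory containing a file name with a space in it.
glob_dir=$(mktemp -d)
touch "$glob_dir/a b.csv" "$glob_dir/c.csv" "$glob_dir/notes.txt"

csv_count=0
for f in "$glob_dir"/*.csv; do
  [ -e "$f" ] || continue          # skip the literal pattern if nothing matched
  csv_count=$((csv_count + 1))
  printf 'found: %s\n' "${f##*/}"  # print the name without the directory part
done
```

Because the shell expands the pattern itself, nothing has to parse the output of `ls`, which is where the word-splitting problems come from.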
        
         | renewiltord wrote:
         | I solve this problem by just having no spaces on my filesystem
         | when I act.
        
         | nixpulvis wrote:
         | I would recommend `find ... -exec`, but I still haven't figured
         | out how to make it compose properly with other UNIX tools.
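One way `find ... -exec` can feed a pipeline is sketched below, reusing the thread's cake example. The scratch files and the names `exec_dir` and `cakes` are hypothetical. `grep -h` is supported by GNU and BSD grep, and the trailing `sort` is only there to make the output order predictable.

```shell
# Scratch CSV files, one of them with a space in its name.
exec_dir=$(mktemp -d)
printf '1,cake,red\n2,pie,blue\n' > "$exec_dir/foo.csv"
printf '3,cake,green\n' > "$exec_dir/bar baz.csv"

# "-exec ... {} +" passes many file names to each grep invocation.
# grep -h drops the file-name prefix, so cut sees only the CSV fields.
cakes=$(find "$exec_dir" -type f -name '*.csv' -exec grep -h cake {} + |
  cut -d, -f2,3 | sort)
echo "$cakes"
```

Everything `-exec` runs writes to find's standard output, so the rest of the pipeline composes as usual after the `|`.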
        
           | revscat wrote:
           | You may want to use file globbing instead. This is one I just
           | used yesterday afternoon. I needed to search for a string in
           | every .js or .jsx file in my project, but didn't want to
           | include specs in the search.
           | 
           |     rg 'MySearchString' **/*.js[x]#~*spec*
           | 
           | Voila. Note that this is for zsh, and you need to set the
           | EXTENDED_GLOB option. But once you do you'll find yourself
           | rarely needing to reach for `find`.
        
             | nicce wrote:
             | Is it really a better option? You can find "find" on every
             | Linux system. For this example, you would need to install
             | zsh and enable extended globbing manually as well. It works
             | only on your machine, but on the other hand there is a
             | chance to learn something which applies everywhere.
        
       ___________________________________________________________________
       (page generated 2021-12-13 23:01 UTC)