[HN Gopher] Awk Technical Notes
       ___________________________________________________________________
        
       Awk Technical Notes
        
       Author : todsacerdoti
       Score  : 94 points
       Date   : 2023-03-28 12:41 UTC (1 days ago)
        
 (HTM) web link (maximullaris.com)
 (TXT) w3m dump (maximullaris.com)
        
       | meltedcapacitor wrote:
       | Awk is an improvement on most of its successors.
       | 
       | (h/t Tony Hoare)
        
       | jrochkind1 wrote:
       | My very first job getting paid to write software was writing
       | scripts in Awk to parse and analyze some software log files, for
       | a faculty software researcher, in, maybe, 1997? I didn't know Awk
       | before, it's just what I inherited. Spent a few hours with the
       | O'Reilly book, and I was like, okay, sure, let's go.
       | 
       | As the stuff we were doing in that project got more complex, at
       | some point someone suggested to teenage me "You might want to
       | look at Perl for this now," and then I moved to that. (with the
       | Camel O'Reilly book, of course!)
       | 
       | Haven't touched either one in years now.
       | 
       | Learning new things can be much more overwhelming for me now, I
       | don't know how much is me vs environment. But I am nostalgic for
       | those days when I'd sit down with a print book, and within hours
       | have a grasp of the fundamentals, or within days feel like I had
       | a basic conceptual understanding of the whole dang
       | thing (not of every possible feature, but of the conceptual
       | framework, the big picture).
        
       | cbazz wrote:
       | [flagged]
        
         | version_five wrote:
         | This commenter is a troll, see history if the above comment
         | isn't enough.
         | 
         | Edit: what the fuck is going on?
        
           | cbazz wrote:
           | I don't get how you could conclude I'm a troll? I'm not
           | spamming nor arguing with anyone, just sharing my opinions
           | and experiences.
        
             | meindnoch wrote:
             | No, you're copy/pasting low-effort ChatGPT babble.
        
           | bioemerl wrote:
           | Looks like they have a history of using ChatGPT to post
           | comments, specifically.
        
             | version_five wrote:
             | Right, but somehow they've been the top post for 1/2 hour
             | and I got modded way down for pointing out it was an
             | obvious troll. I hesitate to comment because I assume
             | that's what the script kiddie is looking for out of this.
        
               | bioemerl wrote:
               | I think the problem you're running into is that this
               | particular comment looks human written?
        
       | 2h wrote:
       | I used AWK for many years, but one day I realized that I had
       | pushed AWK beyond what it's meant for, same as the author here.
       | Classic red flag from the article:
       | 
       |     function NUMBER(    res) {
       |         return (tryParse1("-", res) || 1) &&
       |             (tryParse1("0", res) || tryParse1("123456789", res) && (tryParseDigits(res)||1)) &&
       |             (tryParse1(".", res) ? tryParseDigits(res) : 1) &&
       |             (tryParse1("eE", res) ? (tryParse1("-+",res)||1) && tryParseDigits(res) : 1) &&
       |             asm("number") && asm(res[0])
       |     }
       | 
       | Why put yourself through this, when you can just do something
       | like this instead:
       | 
       |     package parse
       | 
       |     import "strconv"
       | 
       |     func parse_float(s string) (float64, error) {
       |         return strconv.ParseFloat(s, 64)
       |     }
       | 
       |     func parse_int(s string) (int64, error) {
       |         return strconv.ParseInt(s, 10, 64)
       |     }
        
         | donio wrote:
         | Not disagreeing with the overall point but that particular
         | example is from an AWK JSON parser implementation so the whole
         | point is to do it in AWK. If you look at the entire file it's
         | not too bad considering.
         | 
         | Funnily the actual Go JSON decoder code ends up doing something
         | similar during scanning:
         | 
         | https://github.com/golang/go/blob/master/src/encoding/json/d...
        
         | pmarreck wrote:
         | Depends on the AWK implementation, apparently.
         | 
         |     bash> awk -v v="80.1%" 'BEGIN{print v+0.1}'
         |     80.2
         | 
         | gawk has `strtonum`. But yes, parsing in awk generally looks
         | like a pain. With plain positive/negative ints though, not so
         | hard:
         | 
         |     echo "123456" | awk '{
         |         if ($0 ~ /^-?[0-9]+$/) {
         |             num = 0
         |             sign = 1
         |             start = 1
         |             if (substr($0, 1, 1) == "-") {
         |                 sign = -1
         |                 start = 2
         |             }
         |             for (i = start; i <= length($0); i++) {
         |                 digit = substr($0, i, 1)
         |                 num = num * 10 + digit
         |             }
         |             num = sign * num
         |             print "The integer is:", num
         |         } else {
         |             print "Invalid input string:", $0
         |         }
         |     }'
        
         | czx4f4bd wrote:
         | As mentioned, the example you quoted is from a pure-AWK JSON
         | parser. I don't dispute that AWK has issues, but AWK is one of
         | those languages that magically coerces strings to numbers, so
         | you can just write `"1" + 2 + "3.5"` and it'll work.
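A quick way to check the coercion described above. This is only a sketch; the shell variable name `sum` is chosen here for illustration and is not from the thread.

```shell
# Arithmetic operators convert string operands to numbers first.
sum=$(awk 'BEGIN { print "1" + 2 + "3.5" }')
echo "$sum"    # prints 6.5
```

String concatenation in awk is written by juxtaposition rather than with `+`, which is why `+` can always be treated as numeric addition.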
        
       | elteto wrote:
       | Big, big fan of AWK. It sometimes feels like ancient, alien UNIX
       | technology to me. But lately I've been gravitating more and more
       | towards perl. You can write the same one liners (with perl -e and
       | friends), it has superb support for regexes and it's just a more
       | capable language (as expected, not bashing AWK).
        
         | pcwalton wrote:
         | You can use Ruby for this task too. I used to use Perl for
         | throwaway one-liners, but on advice I switched to Ruby because
         | of the bigger community and I'm pretty happy with it.
         | 
         | (Python isn't as nice for one-liner text processing, both
         | because of the lack of Awk heritage--so no built-in regex
         | syntax--and because of the indentation-based syntax requiring
         | newlines for most things.)
        
       | jalk wrote:
       | OMG never realized that $ is an infix operator - Plenty of times
       | where I needed something like $(NF-1) and instead used verbose
       | stuff like NF==5 { ... } NF==6 { ... }
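For illustration, a small sketch of the `$(NF-1)` form on made-up input with five and six fields per line. Because `$` accepts any expression as its operand, a single rule covers both line lengths. The variable name `penultimate` is an assumption of this example.

```shell
# Print the second-to-last field of every line, whatever NF is.
penultimate=$(printf 'a b c d e\na b c d e f\n' | awk '{ print $(NF-1) }')
echo "$penultimate"    # prints d, then e
```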
        
         | jrochkind1 wrote:
         | I don't think you actually mean "infix"?
        
       | colonwqbang wrote:
       | > AWK was designed to not require a GC
       | 
       | ...
       | 
       | > The most substantial consequence is that it's forbidden to
       | return an array from a function, you can return only a scalar
       | value.
       | 
       | This doesn't make sense to me. Does someone understand what it
       | means?
       | 
       | In e.g. C++ a function can return an array without any GC or
       | refcounting, by "moving" the array into the caller's stack.
        
         | doctor_eval wrote:
         | I am not an expert but I'd say it's because Awk arrays are
         | associative; they are more like maps than slices, to use Go
         | terminology. And IIRC (it's been a while) the array values are
         | not strongly typed. So I think you could even say:
         | 
         |     a[1] = "hello"
         |     a["world"] = 2
         | 
         | That means that - unlike C arrays - Awk arrays are not a
         | simple, addressable byte range, but a complex data structure
         | with lots of pointers.
         | 
         | I suppose you could come up with a way to serialise the array
         | and pop it on the stack but that would be a lot of work, and
         | for the kind of things I use Awk for, the arrays would often be
         | huge.
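The usual workaround for the restriction discussed above is to have the caller pass in an array for the function to fill, since awk passes arrays to functions by reference and scalars by value. A minimal sketch follows; the names `fill`, `sq`, and `squares` are hypothetical and not taken from the article.

```shell
squares=$(awk '
# fill() cannot return an array, so it writes into the one the caller passes in.
# "i" after the extra spaces is a local variable, by the usual awk convention.
function fill(arr, n,    i) {
    for (i = 1; i <= n; i++)
        arr[i] = i * i
    return n            # a scalar can still be returned
}
BEGIN {
    split("", sq)       # make sq an empty array before passing it
    count = fill(sq, 3)
    print count, sq[1], sq[2], sq[3]
}')
echo "$squares"    # prints 3 1 4 9
```

With this pattern the array only ever has one owner, the caller, which fits the article's point that the language was designed to work without a garbage collector.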
        
        
       | eimrine wrote:
       | The author is really persuading me to learn awk, because he
       | treats the very reasons I have avoided it as faulty reasons,
       | and I consider his reasoning decent.
        
       | version_five wrote:
       | I'm already a casual awk enthusiast but I'm really hoping to find
       | an opportunity to use it for a "real" software project soon. I've
       | been reading the gawk user manual, and suffice it to say, the power
       | and features of the language are dramatically underutilized for
       | most of the things people normally do with it (my most common use
       | case is probably a hybrid of grep and cut)
       | 
       | https://www.gnu.org/software/gawk/manual/gawk.html
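As a sketch of the "hybrid of grep and cut" use case mentioned above: one awk rule both selects lines and picks a field. The log lines and the variable name `paths` are made up for this example.

```shell
# Made-up access-log lines: method, path, status.
paths=$(printf 'GET /a 200\nGET /b 404\nPOST /c 404\n' | awk '$3 == 404 { print $2 }')
echo "$paths"    # prints /b, then /c
```

A roughly equivalent pipeline for this input would be `grep ' 404$' | cut -d' ' -f2`; the awk version compares the third field directly instead of matching text at the end of the line.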
        
         | dc-programmer wrote:
         | I'm usually not a big side project guy, but I successfully used
         | AWK to solve an IRL problem last year. It really helped solidify
         | my understanding of the language.
         | 
         | The problem was that the Garmin GPS data for a bike ride I had
         | just completed had split into multiple rides. I used AWK to
         | stitch together the data into one file. I also did some basic
         | linear interpolation to fill in missing data points.
         | 
         | The GPS data is formatted as XML and I was able to parse it
         | fairly robustly using AWK.
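The commenter's script is not shown, so the following is only a sketch of what basic linear interpolation can look like in awk. It assumes plain "time value" rows with integer timestamps instead of Garmin XML, and the names `filled`, `prev_t`, and `prev_v` are invented for this example.

```shell
# Made-up "time value" samples with gaps at t=1, t=3 and t=4.
filled=$(printf '0 10\n2 20\n5 50\n' | awk '
NR > 1 {
    # Emit one interpolated row for every missing integer time step.
    for (t = prev_t + 1; t < $1; t++)
        print t, prev_v + ($2 - prev_v) * (t - prev_t) / ($1 - prev_t)
}
{ print $1, $2; prev_t = $1; prev_v = $2 }')
echo "$filled"
```

For this input the gaps are filled with the rows `1 15`, `3 30`, and `4 40`, each lying on the straight line between its two neighbouring samples.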
        
           | account-5 wrote:
           | How did you parse XML with AWK? I would never think of using
           | AWK for XML data. I'd even steer clear of CSV data unless I
           | could guarantee no in field commas or newlines.
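A short illustration of the CSV concern raised above, using a made-up record: splitting on commas with `-F,` knows nothing about quoting, so a quoted field containing a comma is counted as two fields. The variable name `nfields` is an assumption of this sketch.

```shell
# The quoted field "b,c" contains a comma, so a plain -F, split miscounts.
nfields=$(printf 'a,"b,c",d\n' | awk -F, '{ print NF }')
echo "$nfields"    # prints 4, although the record has 3 CSV fields
```

gawk documents an `FPAT` variable for describing field contents instead of separators, which is one way to handle quoted fields; embedded newlines remain a separate problem.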
        
         | ufo wrote:
         | I find that I tend to use AWK for text munging tasks that are
         | too small to call a "project".
        
         | Arnavion wrote:
         | I wrote an IRC bot in it, one of those "paste a line of code
         | and the bot will evaluate it and print the result" bots that
         | you find in programming language channels. It's not a
         | particularly big or "real" project, but it definitely fulfills
         | the need of having a bot in that particular IRC channel.
         | 
         | awk is great for it because IRC (or at least the subset that
         | the bot cares about) is relatively easy to parse, and shelling
         | out to the shell script that does the actual code evaluation and
         | prints the result back is also fairly straightforward. Someone
         | else used to have such a bot before but they had written it in
         | Rust with a bajillion dependencies; if I had done that I
         | would've had to update dependencies and redeploy it every other
         | week. In contrast I deployed my awk version once and then
         | basically haven't touched it in years.
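The bot's source is not shown in the thread, so this is only a guess at what "relatively easy to parse" can mean for the subset of IRC a bot cares about. The sample line, the output format, and the names `line` and `parsed` are all made up for the example.

```shell
# A made-up IRC line of the form ":nick!user@host PRIVMSG target :message".
line=':alice!a@example.org PRIVMSG #awk :print 1 + 2'
parsed=$(printf '%s\n' "$line" | awk '
$2 == "PRIVMSG" {
    nick = substr($1, 2)                       # drop the leading ":"
    sub(/!.*/, "", nick)                       # drop "!user@host"
    msg = $0
    sub(/^:[^ ]+ PRIVMSG [^ ]+ :/, "", msg)    # keep only the message text
    print nick " in " $3 ": " msg
}')
echo "$parsed"    # prints alice in #awk: print 1 + 2
```

Default whitespace splitting already separates the prefix, command, and target, so only the free-form message text needs a regex.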
        
       ___________________________________________________________________
       (page generated 2023-03-29 23:01 UTC)