[HN Gopher] A brief interview with Awk creator Dr. Brian Kernigh...
       ___________________________________________________________________
        
       A brief interview with Awk creator Dr. Brian Kernighan (2022)
        
       Author : breck
       Score  : 109 points
       Date   : 2024-07-17 18:28 UTC (4 hours ago)
        
 (HTM) web link (pldb.io)
 (TXT) w3m dump (pldb.io)
        
       | fuzzy_biscuit wrote:
       | I know the title said brief, but it still took me by surprise
       | that there were only three questions.
        
         | phatfish wrote:
         | No excuses not to read the whole article before commenting this
         | time!
        
       | kansai wrote:
       | Describing the K of K&R as "Awk creator" is like describing
       | Einstein as a "refrigerator engineer".
        
         | overhead4075 wrote:
         | It's an awk interview. "Awk creator" is relevant.
        
           | midiguy wrote:
           | I agree it was an incredibly awkward interview
        
           | riiii wrote:
           | He didn't say it was irrelevant.
        
           | felideon wrote:
           | Right, the headline gave the appropriate context for the
           | interview in a short sentence. Not everyone knows Kernighan
           | created awk, as opposed to a respected person that happens to
           | have opinions on awk.
        
             | Gormo wrote:
             | AWK's name is actually an initialism of the last names of
             | its three creators: Aho, Weinberger, and Kernighan.
        
               | usr1106 wrote:
               | Aho, one of the authors of the dragon books. When I
               | studied CS 40 years ago it was the standard literature
               | on compiler construction, a mandatory course.
               | 
               | No idea whether it is still used today. Well, no idea
               | whether there is anything fundamentally new in compilers
               | such an old book would not cover.
        
               | ryangs wrote:
               | Indeed, still in use. Also presumably the Aho of the Aho-
               | Corasick string matching algorithm.
        
         | JohnFen wrote:
         | Why do you say that? Developing Awk was no less of an
         | accomplishment than C.
        
           | alexthehurst wrote:
           | This seems like a rather large claim that would benefit from
           | some kind of supporting argument.
        
             | kragen wrote:
             | c-descended languages include c++, java, and c#
             | 
             | awk-descended languages include perl, tcl, js, python and
             | lua
        
         | 1234554321a wrote:
         | He's the closest man to C still alive. Yet it's not really
         | "his" accomplishment. C was created by Ritchie alone, Kernighan
         | only wrote the book.
        
           | aap_ wrote:
           | I would say Ken Thompson and Steve Johnson are closer.
        
         | audleman wrote:
         | I read this comment in the spirit of "you might not have known
         | that K did much more than write awk, but he's a genius." Which
         | I didn't know, so I appreciate the context.
        
       | Zambyte wrote:
       | It's interesting to me that he refers to associative arrays as
       | "newish", when they showed up in Lisp nearly 20 years earlier.
        
         | chasil wrote:
         | This is from the first edition free PDF online of _The AWK
         | Programming Language_:
         | 
         | "Associative arrays were inspired by SNOBOL4 tables, although
         | they are not as general. Awk was born on a slow machine with a
         | small memory, and the properties of arrays were a result of
         | that environment. Restricting subscripts to be strings is one
         | manifestation, as is the restriction to a single dimension
         | (even with syntactic sugar). A more general implementation
         | would allow multi-dimensional arrays, or at least allow arrays
         | to be array elements."
         | 
         | Stable release - SNOBOL4 / 1967
         | 
         | https://en.wikipedia.org/wiki/SNOBOL
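
The "single dimension (even with syntactic sugar)" remark in the quote above can be illustrated with a rough sketch. In awk, `a[i, j]` is stored under one string key built by joining the subscripts with `SUBSEP`, whose default value is the character `"\034"`. The helper names below are hypothetical, and `str()` is a simplification of awk's number-to-string conversion (awk formats non-integer values using `CONVFMT`).

```python
# Sketch of how awk flattens a "multi-dimensional" subscript into one string key.
# Helper names are hypothetical; only SUBSEP and its default come from awk itself.
SUBSEP = "\x1c"  # awk's default SUBSEP, written "\034" in octal


def subscript(*indices):
    """Join indices the way awk does for a[i, j]: as strings separated by SUBSEP."""
    return SUBSEP.join(str(i) for i in indices)


def split_subscript(key):
    """Recover the components, as awk programs do with split(key, parts, SUBSEP)."""
    return key.split(SUBSEP)


grid = {}                          # one flat table with string keys
grid[subscript(1, 2)] = "x"        # awk: grid[1, 2] = "x"
grid[subscript("row", 7)] = "y"    # awk: grid["row", 7] = "y"
```

Because every key is a plain string, the table stays one-dimensional; arrays cannot be stored as elements, which is the limitation the book describes.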
        
         | jerf wrote:
         | Two things to keep in mind: Pre-internet, it was quite easy to
         | be skilled and knowledgeable and yet never have been exposed to
         | something another community somewhere would consider basic, or
         | for it to take many many years to percolate across. I remember
         | getting into programming even in the 1990s and I can tell you a
         | lot of the C world still considered associative data
         | structures like hash tables a bit exotic, something to be
         | pulled out only if you had no other choice. It was very
         | easy to not be very up to date on these things.
         | 
         | Second, while the programming world fancies itself a very fast
         | mover, it really isn't, and it especially wasn't back then. 20
         | years isn't necessarily as long as you would think in 2024.
         | We're still trying to convince people in 2024 that you really
         | don't need a language that will happily index outside of
         | arrays, and how long has _that_ been a staple in existing
         | languages? (You may think that's silly, but we've had that
         | fight on HN, yes, yea verily, this very year. Granted, this is
         | winding down, but it's still a live argument.)
         | 
         | There are a lot of things that showed up in some language in
         | the 1960s or 1970s and weren't widely adopted for another few
         | decades.
         | 
         | (Though I suspect one problem with Lisp is that back in the
         | day, its performance was pretty awful relative to the other
         | things of the day. It was not always the best advertisement for
         | its features. A bit too ahead of its time sometimes.)
        
           | Zambyte wrote:
           | I totally get the first point, but this comment from Brian
           | was made when the internet had been available for many, many
           | years. It seems more like it felt newish at the time, and
           | that is how he has seen it since then. I guess that's more
           | why I find it to be an interesting comment.
           | 
           | Regarding the second point, when I hear the term "associative
           | arrays" or "associative lists" I think of them in a pretty
           | high level of abstraction. Like really: a C struct is an
           | association abstraction. A simple array of structs is not
           | often considered an "associative array", but it can easily be
           | used for the same things.
           | 
           | Maybe it would be more accurate to say that the way they were
           | using associative arrays was newish, rather than the
           | construct itself.
        
         | kragen wrote:
         | in a sense lisp alists are 'associative arrays', sure. but,
         | although they were used in the metacircular interpreter that
         | bootstrapped lisp and still inspires us today, they aren't a
         | language feature, and the lisp ones aren't efficient enough for
         | the things you use them for in awk
         | 
         | you can of course build an associative array in any language,
         | especially if you're satisfied with the linear search used by
         | lisp alists. you can hack one together in a few minutes almost
         | without thinking:
         | 
         |             .intel_syntax noprefix
         |     lookup: push rbx
         |             push rbp
         |             mov rbx, rdi    # pointer to alist node pointer
         |             mov rbp, rsi    # key
         |         2:  mov rdi, [rbx]  # load pointer to alist node
         |             test rdi, rdi   # is it null?
         |             jnz 3f          # if not null, skip not-found case
         |             mov rax, rbx    # pointer to null pointer is return value
         |             xor rdx, rdx    # but associated value is null (sets zero flag)
         |             jmp 4f          # jump to shared epilogue
         |         3:  mov rdi, [rdi]  # load pointer from key field of alist node
         |             mov rsi, rbp    # reload key from callee-saved register
         |             call comkey     # sets zero flag if rsi and rdi are equal keys
         |             jne 1f          # skip following return-value case code if !=
         |             mov rax, rbx    # pointer to node pointer is return value
         |             mov rdx, [rbx]  # also let's follow that pointer to the node
         |             mov rdx, [rdx + 16]    # and return its value field too in rdx
         |             test rax, rax   # clear zero flag (valid pointer is not null)
         |         4:  pop rbp
         |             pop rbx
         |             ret
         |         1:  mov rbx, [rbx]  # load pointer to alist node again
         |             add rbx, 8      # load pointer to next-node pointer field
         |             jmp 2b
         | 
         | (untested, let me know if you find bugs)
         | 
         | but that's very different from having them built in as a
         | language feature, like snobol4, mumps, awk, perl, python, tcl,
         | lua, and js do. the built-in language feature lisp had for
         | structuring data 20 years earlier was cons, car, and cdr, which
         | is not at all the same thing
         | 
         | other programs on unix that contained associative arrays prior
         | to awk included the linker (for symbols), the c compiler, the
         | assembler, the kernel (in the form of directories in the
         | filesystem), and the shell, for shell variables. none of these
         | had associative arrays as a programming language feature,
         | though. the bourne shell came closest in that you could
         | concatenate variable names and say things like
         | eval myarray_$key=$val
         | 
         | here's the implementation of binary tree traversal for setting
         | shell variables in the v7 bourne shell from 01979
         | 
         |     NAMPTR          lookup(nam)
         |             REG STRING      nam;
         |     {
         |             REG NAMPTR      nscan=namep;
         |             REG NAMPTR      *prev;
         |             INT             LR;
         | 
         |             IF !chkid(nam)
         |             THEN    failed(nam,notid);
         |             FI
         |             WHILE nscan
         |             DO      IF (LR=cf(nam,nscan->namid))==0
         |                     THEN    return(nscan);
         |                     ELIF LR<0
         |                     THEN    prev = &(nscan->namlft);
         |                     ELSE    prev = &(nscan->namrgt);
         |                     FI
         |                     nscan = *prev;
         |             OD
         | 
         |             /* add name node */
         |             nscan=alloc(sizeof *nscan);
         |             nscan->namlft=nscan->namrgt=NIL;
         |             nscan->namid=make(nam);
         |             nscan->namval=0; nscan->namflg=N_DEFAULT; nscan->namenv=0;
         |             return(*prev = nscan);
         |     }
         | 
         | if you're not sure, that's c (now you know where the ioccc came
         | from)
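
For readers who find the assembly and the macro-heavy C above hard to follow, here is a rough sketch of the two lookup strategies the comment contrasts: a linear search over key/value pairs (the Lisp alist approach) and a lookup-or-insert on an unbalanced binary search tree (the shape of the Bourne shell routine). All names are hypothetical, and this is an illustration under those assumptions, not a translation of either listing.

```python
# Hypothetical names throughout; a sketch of the two techniques, not the original code.

def alist_lookup(alist, key):
    """Linear search over (key, value) pairs, like a Lisp alist: O(n) per lookup."""
    for k, v in alist:
        if k == key:
            return v
    return None


class Node:
    """One named entry in an unbalanced binary search tree."""

    def __init__(self, name):
        self.name = name
        self.value = None
        self.left = None
        self.right = None


class NameTable:
    """String-keyed table where looking up a missing name creates it."""

    def __init__(self):
        self.root = None

    def lookup(self, name):
        """Return the node for name, adding a new leaf if it is absent."""
        parent, went_left = None, False
        node = self.root
        while node is not None:
            if name == node.name:
                return node
            parent, went_left = node, name < node.name
            node = node.left if went_left else node.right
        node = Node(name)              # not found: add a name node
        if parent is None:
            self.root = node
        elif went_left:
            parent.left = node
        else:
            parent.right = node
        return node


table = NameTable()
table.lookup("PATH").value = "/bin:/usr/bin"
table.lookup("HOME").value = "/home/user"
```

The tree version does the same job as the alist but narrows the search at each comparison, which matters once the table holds more than a handful of names. Neither is a language feature in the sense the comment means: the caller still has to go through a function rather than writing `table[name]`.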
        
           | foobarian wrote:
           | I love that Bourne abused C macros to make sh source look
           | like Algol. Lots of references to this elsewhere but this is
           | my favorite:
           | 
           | https://www.tuhs.org/cgi-
           | bin/utree.pl?file=V7/usr/src/cmd/sh...
        
       | FelipeCortez wrote:
       | There's a more comprehensive interview that also includes Aho and
       | Weinberger in the book Masterminds of Programming. Highly
       | recommended
        
         | chasil wrote:
         | His wiki lists several interviews, for those inclined:
         | 
         | https://en.wikipedia.org/wiki/Brian_Kernighan
         | 
         | http://www.linuxjournal.com/article/7035
         | 
         | http://www-2.cs.cmu.edu/~mihaib/kernighan-interview/index.ht...
         | 
         | https://web.archive.org/web/20090428163341/https://www.princ...
         | 
         | https://web.archive.org/web/20131126220450/http://princetons...
        
       | JonChesterfield wrote:
       | I'm very taken with the regex to lex to yacc to awk sequence of
       | developments. There's a very convincing sense of building on
       | prior work to achieve more general results.
        
       | kragen wrote:
       | i've been reading _the unix programming environment_ from 01983
       | this week (the old testament of kernighan and pike), about 35
       | years later than i really should have. awk is really one of the
       | stars of the book, the closest thing in it to currently popular
       | languages like js, lua, python, perl, or tcl. (what awk calls
       | 'associative arrays' js just calls 'objects'.)
       | 
       | the 7th edition unix version of awk from 01979
       | https://www.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/aw...
       | (from
       | https://www.tuhs.org/Archive/Distributions/Research/Henry_Sp...)
       | is only 2680 lines of source code, which is pretty astonishing.
       | the executable is 46k and ran in the pdp-11's 64k address space.
       | as far as i can tell, that version of awk, and also the one
       | documented four years later in _tupe_, didn't even have user-
       | defined functions. neither did the v7 version of the bourne
       | shell, i think
       | 
       | bc, however, did
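
As a small illustration of why associative arrays make awk feel close to later scripting languages, here is a sketch of the classic word-count idiom. The awk one-liner in the comment is standard awk; the Python function name is hypothetical, and `str.split()` only approximates awk's default field splitting.

```python
from collections import defaultdict

# The awk idiom being mirrored:
#   awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
#        END { for (w in count) print w, count[w] }'


def word_counts(lines):
    """Count whitespace-separated fields, as count[$i]++ does in awk."""
    count = defaultdict(int)           # missing keys start at 0, like unset awk elements
    for line in lines:
        for field in line.split():     # approximates awk's default field splitting
            count[field] += 1
    return dict(count)
```

The key point is that no declaration or sizing is needed: a string subscript that has never been seen simply springs into existence with an empty or zero value, which is the behaviour JavaScript objects and Python dictionaries later made routine.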
        
         | mananaysiempre wrote:
         | > that version of awk [...] didn't even have user-defined
         | functions
         | 
         | Also known as "old awk" or (retronymically) "oawk". Heirloom
         | still carries[1] what should be a descendant of it (alongside
         | the more familiar nawk).
         | 
         | Also,
         | 
         | > the executable is 46k
         | 
         | thus still a bit larger than Turbo Pascal 3.02[2] :) That
         | didn't have a regex engine, admittedly.
         | 
         | [1] https://heirloom.sourceforge.net/man/awk.1.html
         | 
         | [2] https://prog21.dadgum.com/116.html
        
           | coliveira wrote:
           | Turbo Pascal was written in asm and highly optimized to
           | reduce its size.
        
         | tannhaeuser wrote:
         | > _the closest thing [...] to currently popular languages like
         | js_
         | 
         | awk is more than that: having introduced the "function"
         | keyword, for e in a, semicolon-less syntax, stringly-typing,
         | delete x[e], etc, etc, it's quite obviously the base for
         | JavaScript, and Brendan Eich said as much [1].
         | 
         | [1]: https://brendaneich.com/2010/07/a-brief-history-of-
         | javascrip...
        
           | kragen wrote:
           | i'd say perl is closer, but for (e in a) is from awk and not
           | perl
           | 
           | semicolonless syntax is in js, tcl, lua, and sh, but not awk
           | or perl; in awk omitted semicolons are interpreted as string
           | concatenation
        
         | dakiol wrote:
         | I read the same book and got the exact same conclusion. Then I
         | read The UNIX-HATERS Handbook and thought that Unix tools
         | (like awk) are sadly not good enough anymore.
        
           | kragen wrote:
           | yeah, there are a lot of questionable design decisions in
           | there that are only defensible in the context of the pdp-11's
           | 64k address space and slow cpu
        
       | gregw2 wrote:
       | Little known random Brian Kernighan facts:
       | 
       | * He joined Princeton's CS department in 2000 but taught at least
       | one class there as early as 1993 while still at Bell Labs
       | Research (on sabbatical?)
       | 
       | * One of his students regularly brought a 386sx laptop (running
       | pre-1.0 Linux) to class and when Brian was asked more obscure
       | questions about what awk did which he couldn't remember, the
       | student would run commands in awk and feed Brian the definitive
       | implementation answer. So Brian had some exposure to Linux
       | moderately early on.
       | 
       | * Here's a writeup from him on putting AT&T's toll free phone
       | directory on the internet back in fall 1994:
       | https://www.cs.princeton.edu/~bwk/800.html
        
       ___________________________________________________________________
       (page generated 2024-07-17 23:01 UTC)