[HN Gopher] K Isn't Lisp (2004)
       ___________________________________________________________________
        
       K Isn't Lisp (2004)
        
       Author : RodgerTheGreat
       Score  : 41 points
       Date   : 2024-02-12 17:13 UTC (1 days ago)
        
 (HTM) web link (nsl.com)
 (TXT) w3m dump (nsl.com)
        
       | theamk wrote:
       | That K solution is O(n^2) complexity, vs O(n*log(n)) for Lisp.
       | 
       | I kinda hate it how all the K advocacy always uses old-style
       | languages with small stdlibs, _and_ always ignores the
       | complexity.. How about matlab (which does have "unique" in
       | stdlib), python/numpy or even R?
        
         | rak1507 wrote:
         | I'm pretty sure the K solution is O(n)? Both group and unique
         | should be O(n).
        
           | theamk wrote:
           | Pretty sure it's impossible to calculate unique/group in
           | O(n), unless either the data is pre-sorted, or your range is
           | extremely limited (and problem says nothing about this), so
           | it has to be at least O(n _log(n))
           | 
           | For K, I think the solution is to find unique first
           | (n_log(n)), the for each unique value go over the list and
           | find how many input values match (multiply by n). That's
           | O(n^2 * log(n)), even worse than I wrote initially.
        
             | mlochbaum wrote:
             | No, the k solution computes the unique elements and group
             | independently. #:'=v[;1] uses group for the first column,
             | and ?v[;1] uses unique for the second. (;) is list
             | notation.
             | 
             | Your analysis of the algorithm you've described is also
             | wrong: it would be O(n*log(n) + n^2), or O(n^2), because
             | the second part loops over unique elements and not
             | operations performed by unique.
             | 
             | In the random-access model, hash table construction/lookup
             | is O(n) average case for n elements, if the elements have
             | bounded size. More generally, it's linear in the total size
             | of the input. If the strings are very long it can take a
             | while to hash and compare them, but there's never a log(n)
             | factor. Maybe you're confusing this with results from
             | comparison sorting?
             | 
             | Both solutions are O(n) with the assumptions of random-
             | access memory and bounded string size (or hash/comparison
             | time), which I consider to be pretty standard.
        
         | adw wrote:
         | Python has
         | https://docs.python.org/3/library/collections.html#collectio...
         | (which I'm sure would raise objections over the volume of code
         | -
         | https://github.com/python/cpython/blob/main/Lib/collections/...
         | - but a lot of it is providing syntactic sugar).
        
         | foomanmanfoo wrote:
         | Unsure of performance but using R's `data.table`(which has uses
         | beyond this example so installing this library shouldn't be a
         | one off)                  library(data.table)        v <-
         | rbindlist(list(            data.table(e1=1, string='one'),
         | data.table(e1=1, string='two'),            data.table(e1=1,
         | string='three'),            data.table(e1=1, string='one'),
         | data.table(e1=1, string='four'),            data.table(e1=1,
         | string='two'))         )        v[, list(counter= sum(e1)),by=
         | string]
        
       | tmtvl wrote:
       | Doing the naive thing in CL without thinking about things like
       | 'efficiency' or 'maintainability':                 (let ((test-
       | input '((1 "one") (1 "two") (1 "three") (1 "one") (1 "four") (1
       | "two"))))         (mapcar (lambda (string)
       | (list (count string test-input :test #'string= :key #'second)
       | string))           (remove-duplicates (mapcar #'second test-
       | input)                              :test #'string= :from-end
       | t)))
       | 
       | I will admit it's not as short as:                 v:((1;"one");(
       | 1;"two");(1;"three");(1;"one");(1;"four");(1;"two"))
       | +(#:'=v[;1];?v[;1])
        
         | SeanLuke wrote:
         | Now you have me thinking about how short I can make that. My
         | revision:                   (defun doit (elts)
         | (mapcar (lambda (elt) `(,(count elt elts :test 'equal :key
         | 'second) ,elt))             (delete-duplicates (mapcar 'second
         | elts) :test 'equal)))))
         | 
         | or maybe                   (defun doit (elts)             (loop
         | for elt in (delete-duplicates (mapcar 'second elts) :test
         | 'equal)             collect `(,(count elt elts :test 'equal
         | :key 'second) ,elt)))
        
           | tmtvl wrote:
           | As much as Lisp isn't meant for code golf, I think the loop
           | variation could be used as a (rather long) one-liner.
        
         | taeric wrote:
         | My naive looked more like: (with apologies on likely messing up
         | spacing in hn.)                   (let ((in '((1 "one") (1
         | "two") (1 "three") (1 "one") (1 "four") (1 "two"))))
         | (loop with hash = (make-hash-table :test #'equal)
         | for v in in                      do (incf (gethash (cadr v)
         | hash 0))                      finally (return (loop for key
         | being the hash-keys of hash
         | collect (list (gethash key hash) key)))))
        
       | gweinberg wrote:
       | If they wanted to have a language that isn't lisp but whose name
       | started with a K, wouldn't it have made more sense to call it
       | kil?
        
         | 7thaccount wrote:
         | K is in the APL family of languages and thus not really related
         | to lisp. The language inventor wrote A+ prior to K. He also
         | assisted with a prototype for the J language (look for the J
         | incunabulum to see), so K is a natural next step.
         | 
         | https://www.jsoftware.com/ioj/iojATW.htm
        
           | ksherlock wrote:
           | If you look the original lisp (ie, M-expressions) and replace
           | the linked lists with vectors it's not that far off.
        
       | aidenn0 wrote:
       | It's trivial to extend the SERIES library to allow collecting
       | sums in a hash-table (It was just a cut-paste of the collect-hash
       | function with setf changed to incf; further extending to allow
       | generic modifications of values is an exercise left to the
       | reader). That allows a shorter (but not as short as I'd like)
       | implementation. It could be shorter if the output list were to
       | have elements like ("one" 1) as the multiple-value-bind exists to
       | reverse that ordering for unpacking the hash-table at the end.
       | 
       | Here's the result (if you wanted to further optimize and know
       | that the result will fit in a fixnum (i.e. +/- 2^62 on modern
       | implementations, you can change the "'number" to "'fixnum"):
       | (defun unique-sum (list)         (multiple-value-bind (keys
       | values)             (scan-hash              (collect-hash-sum
       | (map-fn 'string #'cadr (scan list))               (map-fn 'number
       | #'car (scan list))               :test #'equal))           (map-
       | fn 'list #'list values keys)))
       | 
       | And the extension to SERIES:                 (series::defS
       | collect-hash-sum (keys values &rest option-plist)
       | "(collect-hash keys values :test :size :rehash-size :rehash-
       | threshold)              Combines a series of keys and a series of
       | values together into a hash       table.  Duplicate keys sum the
       | values. The keyword arguments specify       the attributes of the
       | hash table to be produced.  They are used as       arguments to
       | MAKE-HASH-TABLE"         (fragl ((keys t) (values t) (option-
       | plist))         ((table))         ((table t (apply #'make-hash-
       | table option-plist)))         ()                ()
       | ((incf (gethash keys table 0) values)) () () nil)
       | :optimizer         (series::apply-literal-frag           (list
       | '(((keys t) (values t) (table)) ((table)) () ()
       | () ((incf (gethash keys table 0) values)) () () nil)
       | keys values `(make-hash-table ,@ option-plist)))        :trigger
       | t)
        
       | aidenn0 wrote:
       | Does the K solution handle non-unity values in the beginning of
       | each sublist? e.g. ((1 "one")(2 "one")) should result in ((3
       | "one"))
        
       | camdez wrote:
       | The exact semantics of the problem aren't clear (do we need to
       | maintain order? Do we need to use the initial count? Why do we
       | have counts on subsequent elements if we only add 1?), but the
       | logical Clojure solution would just be:
       | (frequencies (map second v))       ;; => {"one" 2, "two" 2,
       | "three" 1, "four" 1}            ;; or:            (update-vals
       | (group-by second v) count)       ;; => {"one" 2, "two" 2, "three"
       | 1, "four" 1}            ;; +(#:'=v[;1];?v[;1])       ;;
       | flip(count each group v[;1];unique v[;1])
       | 
       | So, shorter than the readable K version in a Lisp.
       | 
       | What did we learn here? Probably nothing.
       | 
       | K is great for sequence operations--not sure what we're trying to
       | imply about Lisps.
        
       | asa400 wrote:
       | awk:                 echo '1 "one"\n1 "two"\n1 "three"\n1
       | "one"\n1 "four"\n1 "two"' | awk '{ counters[$2] += $1 } END { for
       | (id in counters) { print counters[id], id } }'            2 "two"
       | 1 "four"       1 "three"       2 "one"
        
       ___________________________________________________________________
       (page generated 2024-02-13 23:01 UTC)