[HN Gopher] K Isn't Lisp (2004)
___________________________________________________________________
K Isn't Lisp (2004)
Author : RodgerTheGreat
Score : 41 points
Date : 2024-02-12 17:13 UTC (1 days ago)
(HTM) web link (nsl.com)
(TXT) w3m dump (nsl.com)
| theamk wrote:
| That K solution is O(n^2) complexity, vs O(n*log(n)) for Lisp.
|
| I kinda hate it how all the K advocacy always uses old-style
| languages with small stdlibs, _and_ always ignores the
| complexity.. How about matlab (which does have "unique" in
| stdlib), python/numpy or even R?
| rak1507 wrote:
| I'm pretty sure the K solution is O(n)? Both group and unique
| should be O(n).
| theamk wrote:
| Pretty sure it's impossible to calculate unique/group in
| O(n), unless either the data is pre-sorted, or your range is
| extremely limited (and problem says nothing about this), so
| it has to be at least O(n _log(n))
|
| For K, I think the solution is to find unique first
| (n_log(n)), the for each unique value go over the list and
| find how many input values match (multiply by n). That's
| O(n^2 * log(n)), even worse than I wrote initially.
| mlochbaum wrote:
| No, the k solution computes the unique elements and group
| independently. #:'=v[;1] uses group for the first column,
| and ?v[;1] uses unique for the second. (;) is list
| notation.
|
| Your analysis of the algorithm you've described is also
| wrong: it would be O(n*log(n) + n^2), or O(n^2), because
| the second part loops over unique elements and not
| operations performed by unique.
|
| In the random-access model, hash table construction/lookup
| is O(n) average case for n elements, if the elements have
| bounded size. More generally, it's linear in the total size
| of the input. If the strings are very long it can take a
| while to hash and compare them, but there's never a log(n)
| factor. Maybe you're confusing this with results from
| comparison sorting?
|
| Both solutions are O(n) with the assumptions of random-
| access memory and bounded string size (or hash/comparison
| time), which I consider to be pretty standard.
| adw wrote:
| Python has
| https://docs.python.org/3/library/collections.html#collectio...
| (which I'm sure would raise objections over the volume of code
| -
| https://github.com/python/cpython/blob/main/Lib/collections/...
| - but a lot of it is providing syntactic sugar).
| foomanmanfoo wrote:
| Unsure of performance but using R's `data.table`(which has uses
| beyond this example so installing this library shouldn't be a
| one off) library(data.table) v <-
| rbindlist(list( data.table(e1=1, string='one'),
| data.table(e1=1, string='two'), data.table(e1=1,
| string='three'), data.table(e1=1, string='one'),
| data.table(e1=1, string='four'), data.table(e1=1,
| string='two')) ) v[, list(counter= sum(e1)),by=
| string]
| tmtvl wrote:
| Doing the naive thing in CL without thinking about things like
| 'efficiency' or 'maintainability': (let ((test-
| input '((1 "one") (1 "two") (1 "three") (1 "one") (1 "four") (1
| "two")))) (mapcar (lambda (string)
| (list (count string test-input :test #'string= :key #'second)
| string)) (remove-duplicates (mapcar #'second test-
| input) :test #'string= :from-end
| t)))
|
| I will admit it's not as short as: v:((1;"one");(
| 1;"two");(1;"three");(1;"one");(1;"four");(1;"two"))
| +(#:'=v[;1];?v[;1])
| SeanLuke wrote:
| Now you have me thinking about how short I can make that. My
| revision: (defun doit (elts)
| (mapcar (lambda (elt) `(,(count elt elts :test 'equal :key
| 'second) ,elt)) (delete-duplicates (mapcar 'second
| elts) :test 'equal)))))
|
| or maybe (defun doit (elts) (loop
| for elt in (delete-duplicates (mapcar 'second elts) :test
| 'equal) collect `(,(count elt elts :test 'equal
| :key 'second) ,elt)))
| tmtvl wrote:
| As much as Lisp isn't meant for code golf, I think the loop
| variation could be used as a (rather long) one-liner.
| taeric wrote:
| My naive looked more like: (with apologies on likely messing up
| spacing in hn.) (let ((in '((1 "one") (1
| "two") (1 "three") (1 "one") (1 "four") (1 "two"))))
| (loop with hash = (make-hash-table :test #'equal)
| for v in in do (incf (gethash (cadr v)
| hash 0)) finally (return (loop for key
| being the hash-keys of hash
| collect (list (gethash key hash) key)))))
| gweinberg wrote:
| If they wanted to have a language that isn't lisp but whose name
| started with a K, wouldn't it have made more sense to call it
| kil?
| 7thaccount wrote:
| K is in the APL family of languages and thus not really related
| to lisp. The language inventor wrote A+ prior to K. He also
| assisted with a prototype for the J language (look for the J
| incunabulum to see), so K is a natural next step.
|
| https://www.jsoftware.com/ioj/iojATW.htm
| ksherlock wrote:
| If you look the original lisp (ie, M-expressions) and replace
| the linked lists with vectors it's not that far off.
| aidenn0 wrote:
| It's trivial to extend the SERIES library to allow collecting
| sums in a hash-table (It was just a cut-paste of the collect-hash
| function with setf changed to incf; further extending to allow
| generic modifications of values is an exercise left to the
| reader). That allows a shorter (but not as short as I'd like)
| implementation. It could be shorter if the output list were to
| have elements like ("one" 1) as the multiple-value-bind exists to
| reverse that ordering for unpacking the hash-table at the end.
|
| Here's the result (if you wanted to further optimize and know
| that the result will fit in a fixnum (i.e. +/- 2^62 on modern
| implementations, you can change the "'number" to "'fixnum"):
| (defun unique-sum (list) (multiple-value-bind (keys
| values) (scan-hash (collect-hash-sum
| (map-fn 'string #'cadr (scan list)) (map-fn 'number
| #'car (scan list)) :test #'equal)) (map-
| fn 'list #'list values keys)))
|
| And the extension to SERIES: (series::defS
| collect-hash-sum (keys values &rest option-plist)
| "(collect-hash keys values :test :size :rehash-size :rehash-
| threshold) Combines a series of keys and a series of
| values together into a hash table. Duplicate keys sum the
| values. The keyword arguments specify the attributes of the
| hash table to be produced. They are used as arguments to
| MAKE-HASH-TABLE" (fragl ((keys t) (values t) (option-
| plist)) ((table)) ((table t (apply #'make-hash-
| table option-plist))) () ()
| ((incf (gethash keys table 0) values)) () () nil)
| :optimizer (series::apply-literal-frag (list
| '(((keys t) (values t) (table)) ((table)) () ()
| () ((incf (gethash keys table 0) values)) () () nil)
| keys values `(make-hash-table ,@ option-plist))) :trigger
| t)
| aidenn0 wrote:
| Does the K solution handle non-unity values in the beginning of
| each sublist? e.g. ((1 "one")(2 "one")) should result in ((3
| "one"))
| camdez wrote:
| The exact semantics of the problem aren't clear (do we need to
| maintain order? Do we need to use the initial count? Why do we
| have counts on subsequent elements if we only add 1?), but the
| logical Clojure solution would just be:
| (frequencies (map second v)) ;; => {"one" 2, "two" 2,
| "three" 1, "four" 1} ;; or: (update-vals
| (group-by second v) count) ;; => {"one" 2, "two" 2, "three"
| 1, "four" 1} ;; +(#:'=v[;1];?v[;1]) ;;
| flip(count each group v[;1];unique v[;1])
|
| So, shorter than the readable K version in a Lisp.
|
| What did we learn here? Probably nothing.
|
| K is great for sequence operations--not sure what we're trying to
| imply about Lisps.
| asa400 wrote:
| awk: echo '1 "one"\n1 "two"\n1 "three"\n1
| "one"\n1 "four"\n1 "two"' | awk '{ counters[$2] += $1 } END { for
| (id in counters) { print counters[id], id } }' 2 "two"
| 1 "four" 1 "three" 2 "one"
___________________________________________________________________
(page generated 2024-02-13 23:01 UTC)