[HN Gopher] KlongPy: High-Performance Array Programming in Python
       ___________________________________________________________________
        
       KlongPy: High-Performance Array Programming in Python
        
       Author : cl3misch
       Score  : 97 points
       Date   : 2024-12-02 06:40 UTC (16 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | solidsnack9000 wrote:
       | I gather where it says `:monad` it is referring to an operation
       | that had an effect on the interpreter state?
        
         | Pompidou wrote:
          | No. In APL-derived array programming languages, verbs (or
          | functions) are monadic or dyadic: they accept only one or two
          | arguments.
          | 
          | In '1 + 1', + is a dyadic operator, while in 'exp 5', exp is
          | monadic.
          | 
          | In J, and in APL I guess, the left arg is usually understood as
          | 'control data', while the right arg is the data upon which the
          | calculation is done. The left argument is usually left unchanged
          | after calculations.
          | 
          | In this way, it is possible to create multi-argument verbs by
          | placing boxed args on the left of the verb.
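          | 
          | A minimal KlongPy sketch of the distinction, reusing the sum
          | example from elsewhere in the thread ('add' is just an
          | illustrative name, not a builtin):
          | 
          |     from klongpy import KlongInterpreter
          | 
          |     klong = KlongInterpreter()
          |     klong('sum::{+/x}')   # monadic: one argument, bound to x
          |     klong('add::{x+y}')   # dyadic: two implicit args, x and y
          |     print(klong('sum([1 2 3])'))  # 6
          |     print(klong('add(2;3)'))      # 5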
        
           | itishappy wrote:
           | It sounds unrelated to the monads popularized by Haskell
           | tutorials. Naming coincidence or am I missing something?
        
             | odo1242 wrote:
             | Naming coincidence, they are in fact unrelated.
        
             | skruger wrote:
             | APL's use of 'monad' predates Haskell's existence by
             | decades.
        
               | 082349872349872 wrote:
                | ...and Leibniz's "windowless monads" (all three concepts
                | are pairwise distinct, related solely by the Greek prefix
                | for 'single') predate APL and Haskell by centuries.
        
             | solidsnack9000 wrote:
             | A `monad` in Category Theory (which is where Haskell gets
             | them from, ultimately) is a single-argument functor and two
             | natural transformations, so there probably is a through-
             | line here in terms of one-argument function-likes.
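              | 
              | For the curious, a tiny Python sketch of that structure
              | using lists as the single-argument functor (purely
              | illustrative; nothing KlongPy-specific):
              | 
              |     # the one-argument functor: List, with its map
              |     def fmap(f, xs):
              |         return [f(x) for x in xs]
              | 
              |     # the two natural transformations
              |     def unit(x):        # a -> List a
              |         return [x]
              | 
              |     def join(xss):      # List (List a) -> List a
              |         return [x for xs in xss for x in xs]
              | 
              |     print(join(fmap(unit, [1, 2, 3])))  # [1, 2, 3]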
        
           | solidsnack9000 wrote:
           | Thanks kindly.
        
       | incrudible wrote:
       | Naive array programming is not really high performance in my
       | book, because performing a lot of trivial arithmetic on large
       | arrays leads to poor locality and high bandwidth pressure. The
       | better alternative is SPMD, i.e. something like CUDA or ISPC for
       | CPUs. This is possible with some type of JIT if the numpy style
       | of programming is to be maintained, for example tinygrad.
        
         | cl3misch wrote:
          | If you like high-performance array programming a la "numpy with
          | JIT", I suggest looking at JAX. It's very suitable for general
          | numeric computing (not just ML) and has a very mature ecosystem.
         | 
         | https://github.com/jax-ml/jax
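          | 
          | A minimal sketch of that "numpy with JIT" style (assuming JAX
          | is installed; the function is just an arbitrary example):
          | 
          |     import jax
          |     import jax.numpy as jnp
          | 
          |     @jax.jit  # trace once, then fuse/compile the element-wise ops
          |     def normalize(x):
          |         return (x - jnp.mean(x)) / jnp.std(x)
          | 
          |     x = jnp.arange(1_000_000, dtype=jnp.float32)
          |     print(normalize(x)[:3])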
        
         | CyberDildonics wrote:
         | You are absolutely right, naive array programming might be much
         | faster than raw python but it will never be high performance
         | because you can't use caches and memory bandwidth effectively.
        
           | bunderbunder wrote:
           | The unstated major premise here is that you're working with
           | data that's big enough that this becomes your major
           | bottleneck.
           | 
           | There's also a subset - possibly even a silent majority - of
           | problems where you're doing fairly intensive calculation on
           | data that fairly comfortably fits into a modern CPU cache.
           | For those, I'd still expect SIMD to be a great choice from a
           | performance perspective.
        
             | CyberDildonics wrote:
              | I'm not exactly sure what this means, but the bandwidth to
              | the level 3 cache isn't actually more than memory bandwidth.
              | If you only have a few megabytes to work with, it's a lot
              | less likely performance is going to matter anyway, and if it
              | does, you would still want to do multiple operations on each
              | array item if possible: put a few more lines in a loop and
              | you're talking about registers at that point, not cache.
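              | 
              | A rough illustration of "more operations per pass" with
              | NumPy (numexpr is one off-the-shelf way to fuse the passes;
              | the expression and sizes here are arbitrary):
              | 
              |     import numpy as np
              |     import numexpr as ne
              | 
              |     a, b, c = (np.random.rand(10_000_000) for _ in range(3))
              | 
              |     # separate passes over memory, with full-size temporaries
              |     unfused = 2.0 * a + b * c
              | 
              |     # one blockwise pass: intermediates stay in cache/registers
              |     fused = ne.evaluate("2.0 * a + b * c")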
        
               | dzaima wrote:
               | L3 cache bandwidth is often more than main memory; most
               | systems at https://old.chipsandcheese.com/memory-
               | bandwidth-data/ give at least a ~2x difference from
               | whatever sample I clicked.
        
               | bunderbunder wrote:
               | And on a recent generation Xeon, your mid-level data
               | cache can total up to 64MB, depending on model. That's
               | maybe the more interesting one? L3 cache is shared among
               | cores, so access to it can be more heavily impacted by
               | memory concurrency concerns and not just pure bandwidth
               | concerns. Meaning there's potentially a strong incentive
               | to keep your task size below 2MB per core.
        
               | CyberDildonics wrote:
                | What are you doing that is computationally intensive,
                | uses Python and a CPU with 64MB of cache, but is under
                | 64MB total? How do you know you aren't losing
                | performance by doing single operations on each array in
                | Python?
        
               | bunderbunder wrote:
               | Model training and inference, for example. Despite the
               | hype, there's actually a lot of data science work that
               | uses modestly-sized sets of structured data and doesn't
               | really need a billion parameter deep learning model. And
               | even when you're running inference on larger datasets
               | it's often fine to just use mini batches.
               | 
               | I don't really think of it as a "losing" performance
               | problem. It's all about tradeoffs. SIMD tooling often
               | offers an easier programming model than SPMD tooling, and
               | spending an extra 60 minutes of development time to save
               | myself 60 seconds of run time is generally not a
               | worthwhile trade in my problem space. Especially if I
               | also need to run it on more expensive compute instances
               | to realize that speedup.
        
               | CyberDildonics wrote:
               | It seems like instead of a "silent majority doing
               | intensive calculations" what you're really saying is "I
               | don't think of slower speeds as a performance loss, I
               | don't pay attention to that, I don't care about slower
               | speeds, my data is small and my programs run almost
               | instantly".
               | 
                | If someone says "this isn't really high performance" and
                | you say "I don't personally care, my stuff is small and
                | doesn't take a long time to run", that isn't a coherent
                | reply.
        
         | mlochbaum wrote:
         | Memory traffic is certainly the biggest problem that the array
         | paradigm presents for implementation, yes. I'd quibble with
         | calling that "poor locality": when working with large arrays,
         | if any part of a cache line is accessed it's very likely the
         | entire line will be used in short order, which is the
         | definition of good locality as I understand it. The issue is
         | simply high memory use and a large number of accesses.
         | 
         | I think it's an oversimplification to say SPMD is a better
         | model full stop. Array programming has the advantage that the
         | implementer optimizes a specific function, and its interaction
         | with the rest of the program is much simpler: all the data is
         | available at the start, and the result can be produced in any
         | order. So things like multi-pass algorithms and temporary
         | lookup tables are possible, particularly valuable if the
         | program isn't spending that much time on arithmetic but rather
         | "heavy" searching and sorting operations. Not that I claim
         | NumPy or CuPy do a good job of this. Sections "Fusion versus
         | fission" and "Dynamic versus static" are relevant here, I
         | think:
         | https://mlochbaum.github.io/BQN/implementation/versusc.html#...
        
       | kstrauser wrote:
       | This is awesome. I've wanted to play with array languages before
       | but they tend to be kind of a pain in the neck to start with
       | locally. I like the idea of hacking my way around in the new
       | language while keeping a Python escape hatch close by in case I
       | haven't yet learned how to do a thing the new way.
       | 
       | Nice.
        
       | MPSimmons wrote:
       | I thought I knew python, but then I saw this:
        | 
        |     ?> sum::{+/x}  :" sum + over / the array x
        |     :monad
        |     ?> sum([1 2 3])
        |     6
        |     ?> count::{#x}
        |     :monad
        |     ?> count([1 2 3])
        |     3
       | 
       | What?
        
         | MPSimmons wrote:
         | oooooooh I think I'm starting to see...
         | 
         | sum::{} is the creation of a monadic function (which I just
         | learned about because I don't have a CS background)
         | 
         | :" is a comment
         | 
         | sum([...]) is calling the function with a list, and the
         | function will iterate over the list adding each of the elements
         | 
         | count::{#x} is another monadic function that returns the number
         | of elements, which then gets called
        
           | gshulegaard wrote:
            | I just briefly reviewed the README docs; I believe KlongPy
            | is a custom language which transpiles to Python. The REPL
            | block you are trying to interpret is the KlongPy REPL, not a
            | Python one.
            | 
            | Embedding KlongPy in a Python block would look more like this
            | (also from the docs):
            | 
            |     from klongpy import KlongInterpreter
            |     import numpy as np
            | 
            |     data = np.array([1, 2, 3, 4, 5])
            |     klong = KlongInterpreter()
            |     # make the data NumPy array available to KlongPy code by
            |     # passing it into the interpreter: we are creating a
            |     # symbol in KlongPy called 'data' and assigning the
            |     # external NumPy array value
            |     klong['data'] = data
            |     # define the average function in KlongPy
            |     klong('avg::{(+/x)%#x}')
            |     # call the average function with the external data and
            |     # return the result.
            |     r = klong('avg(data)')
            |     print(r)  # expected value: 3
            | 
            | Note the calls to "klong('<some-str-of-klong-syntax>')".
        
           | itishappy wrote:
            | I'm new too, so take my opinion with a grain of salt, but I
            | think you're right with one minor correction:
            | 
            |     name::{body}   :" function declaration
            | 
            | This isn't necessarily niladic or monadic or dyadic on its
            | own; it depends on the body. The bodies from the example all
            | just happened to be monadic.
            | 
            |     +/x    :" monadic : sum over argument x
            |     #x     :" monadic : count of the argument x
            |     +      :" dyadic  : sum
            |     !10    :" niladic : an array from 0 to 9
            | 
            | I had to look up how calling a dyadic function looks in
            | Klong:
            | 
            |     add::{+}
            |     add(2;3)
        
             | MPSimmons wrote:
             | Man, that's so weird. Thanks for the clarification.
        
         | nils-m-holm wrote:
          | The language in the examples is Klong, not Python. KlongPy
         | translates Klong to Python and then executes it.
        
         | asloan23 wrote:
         | Sorry if this is a dumb question, why is this weird in Python?
         | I use R as my main language and this is exactly how I would
         | expect it to work. Does Python not do this naturally?
        
           | faizshah wrote:
           | > sum::{+/x}
           | 
           | > count::{#x}
           | 
            | These two expressions are not valid in R or Python. They
            | implicitly declare a function without declaring the
            | arguments, implicitly iterate over the array, and return the
            | result (all features of array languages that make them more
            | concise).
            | 
            | Python's + and / are functions of two objects (dyadic) and
            | are syntactic sugar (they can't be treated as function
            | values; that's what the operator module is for): https://docs.
            | python.org/3/reference/datamodel.html#object.__...
           | 
           | The array languages optimize for concise code so a lot of
           | things are done implicitly that are done explicitly in
           | Python.
           | 
           | The python equivalent:
           | 
           | > _sum = lambda x : functools.reduce(operator.add, x)
           | 
           | > _count = lambda x: functools.reduce(lambda acc, _: acc + 1,
           | x, 0)
           | 
            | Of course, in Python and R there are built-ins for these
            | already (sum(x) and len(x)), but this is just to show what
            | array languages are doing for you.
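            | 
            | The same equivalents as a runnable snippet, with the imports
            | spelled out:
            | 
            |     import functools
            |     import operator
            | 
            |     _sum = lambda x: functools.reduce(operator.add, x)
            |     _count = lambda x: functools.reduce(
            |         lambda acc, _: acc + 1, x, 0)
            | 
            |     print(_sum([1, 2, 3]))    # 6
            |     print(_count([1, 2, 3]))  # 3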
        
           | jofer wrote:
           | Klong is a different language than Python. Python using numpy
           | would look broadly similar for the first few examples, yes.
           | However, this is a way of executing a different language on
           | numpy arrays. Array languages are a different paradigm. Klong
           | is an array language, while Python + numpy allows some array
           | paradigms, but isn't an array language.
           | 
           | Notice how they're _defining_ the "sum" operation there.
           | Instead of being something builtin, they defined "sum" as
           | {+/x}.
           | 
            | {+/x} is the interesting part. That's Klong. The point isn't
            | being able to define a "sum" operation, it's that "sum" can
            | be expressed as {+/x}. That's very different from both R and
            | Python.
        
       | nils-m-holm wrote:
       | Good to see KlongPy thrive! It is based on Klong
       | (http://t3x.org/klong/), which I do not maintain any more, so I
       | am glad that Brian took over!
        
       | amarcheschi wrote:
       | I'm gonna be the one who asks the dumb question, but someone has
       | to do it: why are expressions evaluated from right to left?
        
         | bear8642 wrote:
         | Because that's what APL does:
         | https://www.jsoftware.com/papers/EvalOrder.htm
         | 
          | It also simplifies things, as there's no need to implement BIDMAS.
        
         | abrudz wrote:
          | Think "normal" call syntax like
          | 
          |     foo(bar(baz(42)))
          | 
          | and then remove the superfluous parens
          | 
          |     foo bar baz 42
          | 
          | The expression is evaluated from right to left.
          | 
          | Now, let's make two of the functions into object members:
          | 
          |     A.foo(bar(B.baz(42)))
          | 
          | Remove the parens, extracting the methods from their objects,
          | instead feeding each object as a left argument to its former
          | member function:
          | 
          |     A foo bar B baz 42
          | 
          | This is normal APL-style call syntax; right-to-left if you
          | want.
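          | 
          | In KlongPy terms (a small sketch, assuming Klong follows the
          | APL right-to-left rule described above):
          | 
          |     from klongpy import KlongInterpreter
          | 
          |     klong = KlongInterpreter()
          |     print(klong('2*3+4'))  # 14: 3+4 is evaluated first, then 2*7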
        
       | behnamoh wrote:
       | What are some real-world practical applications of array
        | programming? Because as far as I know, APL did not catch on, and
        | its successors like J, K, and BQN also did not see wide use in
        | industry. Maybe there's something I'm missing. Why
       | would you want to do array programming in Python? What are the
       | advantages over regular Python programming?
        
         | fluorinerocket wrote:
          | I thought they had a niche in finance.
        
         | mlochbaum wrote:
         | APL did catch on to some extent, see
         | https://news.ycombinator.com/item?id=39471718 .
         | 
         | Without getting into any discussion of the array paradigm
         | itself, the reason commercial programming is such a winner-
         | take-all system now is hiring and organizational difficulties,
         | not that popular languages are so much more productive than
         | unpopular ones. Building a working application is the easy part
         | of running a business.
        
         | itishappy wrote:
         | Array programming databases (eg: KDB+, Shakti) are (per their
         | own metrics) way faster than the competition. Licensing fees
         | are in the 6 digits, so I'm not comparing them personally, but
         | people pay that exorbitant amount, and I assume they do so for
         | a reason.
         | 
         | https://shakti.com/
        
         | juliangoldsmith wrote:
         | The array languages aren't super popular because of the sharp
         | learning curve. They're a lot different than most other
         | languages, and they have a lot of operators that simply don't
         | exist in something like C++.
         | 
         | A few years ago there was an article about K's use in high-
         | frequency trading. I'm not sure about usage of APL and J,
         | though. BQN is still fairly new, so it will take a while to see
         | much production usage.
         | 
         | If you've ever written code using NumPy, Tensorflow, or
         | PyTorch, you're doing array programming. Those libraries are
         | heavily influenced by the array languages, including taking a
         | lot of terminology (rank, etc.). I've personally found that
         | playing with J and BQN helped me understand Tensorflow, and
         | vice versa.
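          | 
          | For example, the same whole-array style shows up in everyday
          | NumPy (a trivial sketch, not tied to any particular library
          | version):
          | 
          |     import numpy as np
          | 
          |     x = np.arange(1_000_000, dtype=np.float64)
          | 
          |     # loop style: one element at a time
          |     total = 0.0
          |     for v in x:
          |         total += v * v
          | 
          |     # array style: state the whole computation on the array
          |     total_vec = (x * x).sum()   # or np.dot(x, x)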
        
           | behnamoh wrote:
           | > including taking a lot of terminology (rank, etc.).
           | 
           | AFAIK that's a linear algebra term for matrices, not
           | something array programming or Pytorch or TF invented.
        
             | ssfrr wrote:
             | it's an unfortunate terminology collision.
             | 
             | - array languages: rank is the dimensionality of an array,
              | i.e. a vector is rank-1, a matrix is rank-2, an N-D array is
              | rank-N
             | 
             | - linear algebra: rank is the number of linearly-
             | independent columns (and also rows)
             | 
             | So for example, if you have a 5x5 matrix where 4 of the
             | columns are linearly independent, it would be rank-4 in the
             | linear algebra sense, and rank-2 in the array language
             | sense.
             | 
             | I guess (though I've never really thought of it before)
             | that you could say that the array-language definition is
             | the rank (in the linear algebra sense) of the index space.
             | Not sure if that's intentional.
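              | 
              | NumPy happens to expose both senses, which makes the
              | collision easy to see (small sketch):
              | 
              |     import numpy as np
              | 
              |     m = np.array([[1., 2.], [2., 4.]])  # rows are dependent
              |     print(m.ndim)                    # 2 (array-language "rank")
              |     print(np.linalg.matrix_rank(m))  # 1 (linear-algebra rank)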
        
       | cturner wrote:
       | I would love something like duolingo, but for an array language.
        | Not a steady series of challenging puzzles like leetcode sites;
        | rather, a series of questions with high repetition, somewhat like
        | flash cards, easy to drop into for a few minutes.
        
         | 082349872349872 wrote:
         | Have you tried following (working through) Iverson's Turing
         | Award lecture?
         | 
         | https://dl.acm.org/ft_gateway.cfm?id=1283935&type=pdf
        
         | abrudz wrote:
         | See https://github.com/Isaac-Flath/anki and
         | https://github.com/aeamaea/APL_Anki_Decks
        
       | eismcc wrote:
       | KlongPy author here: AMA
        
         | sevensor wrote:
         | Would it make sense to think of this as a compact syntax for
         | numpy? When it comes to array operations, are there differences
         | that go deeper than the syntax?
        
           | eismcc wrote:
            | KlongPy has a lot of other features beyond pure NumPy
            | operations (such as IPC and a web server), which you could
            | see as ways of putting array operations to use in an
            | application. You could look at the core Klong language as
            | what you suggest.
        
         | Qem wrote:
          | On Linux, rlwrap is used to get the REPL working. Is it possible
          | to get the REPL working under PowerShell on a Windows box too?
        
       | faizshah wrote:
        | I've tried several times to give J/k/Q/kdb a chance, but I haven't
        | really found a convincing example of why this approach is better
        | than, say, SQL or numpy/jax etc.
       | 
       | The syntax has the same problem as perl in that you have to learn
       | too many symbols that are hard to look up. And this combined with
       | the tacit style makes it difficult to parse what the code is
       | doing.
       | 
       | I think ruby went in the radically opposite direction in that it
       | provides a similar feature set to perl with similar functional
       | features but everything is human readable text. I feel like
       | there's room for a human readable version of J that relies less
       | on syntactical sugar.
       | 
       | I do think the direction Klong has gone with the syntax is an
       | improvement: https://t3x.org/klong/ambiguity.html
       | 
       | I'm curious if anyone has a go-to article/example of why an array
       | language would be a good choice for a particular problem over
       | other languages?
       | 
       | I know the second chapter of J for C programmers is kinda like
       | that but I didn't really find it convincing:
       | https://www.jsoftware.com/help/jforc/culture_shock.htm#_Toc1...
        
         | ramses0 wrote:
         | There was a step-change improvement for me when I tried
         | expressing some JS patterns via `underscore.js` instead of
         | procedurally: eg: http://underscorejs.org/#each
         | 
         | Thinking of something as `each | map | filter | sum` is waaay
         | less buggy than writing bespoke procedural code to do the same
         | thing. No doubt there is a "cost" to it as well, but the
         | _abstraction_ is valuable.
         | 
         | Now, if there were a "compiler" which could optimize that whole
         | pipeline down and squeeze out the inefficiencies between steps
         | because it could "see" the whole program at the same time. (oh,
         | I don't know, something like `SELECT * FROM foo WHERE a > 100
         | AND b < 9000 LEFT JOIN bar ON ...etc...`)
         | 
         | ...perhaps you could get both an expressivity gain (by using
         | higher level concepts than "for" and "while"), a reduction in
         | bugs (because you're not re-implementing basic work-a-day
         | procedures), and an improvement in efficiency (because the
         | "compiler" can let you express things "verbosely", while it
         | sorts out the details of efficiency gains that would be tenuous
         | to express and keep up to date by hand).
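          | 
          | The same idea in Python, where a generator pipeline at least
          | avoids materialising the intermediate lists (a rough sketch,
          | not a real query optimizer; the data mirrors the SQL example
          | above):
          | 
          |     rows = [{"a": 150, "b": 42}, {"a": 50, "b": 7},
          |             {"a": 200, "b": 8999}]
          | 
          |     # each | filter | map | sum, evaluated lazily in one pass
          |     result = sum(r["a"] for r in rows
          |                  if r["a"] > 100 and r["b"] < 9000)
          |     print(result)  # 350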
        
           | faizshah wrote:
            | I 100% agree. I think the functional features that have been
            | added across all the popular languages (map, reduce, fold,
            | etc.) have been a positive. Nothing demonstrates it better
            | (imo) than purrr in R:
            | https://github.com/rstudio/cheatsheets/blob/main/purrr.pdf
            | 
            | I also think there is some merit to "high syntactical
            | density": clearly, if you can see the entire code in one
            | place instead of having to navigate through many files or
            | sections, that's beneficial. (Heavily discussed in the last
            | big HN thread: https://news.ycombinator.com/item?id=38981639)
           | 
           | I also think JQ has proven the merit of tacit functional
           | languages in that you can concisely write arbitrary
           | transforms/queries on json that can be more expressive than
           | SQL (many SQL engines have added JSONPath anyway). And I also
           | think postfix is great for processing pipelines.
           | 
            | But I am not totally convinced by the APL/J/Q/KDB combination
            | of terse style + prefix + tacit, because it makes the code so
            | difficult to read. I think an approach similar to JQ, where
            | operators are human-readable words instead of symbols, would
            | get us halfway to trying out the verbs, adverbs, etc. of the
            | APL family. The problem with making it human-readable text is
            | that you lose the conciseness, which is part of the draw of
            | the APL family, since they aim for high syntax density and
            | code analogous to mathematical expressions.
        
         | 082349872349872 wrote:
         | The biggest advantage of the terse symbolic style for me comes
         | when you wish to rapidly compare different ways (algorithms) to
         | compute the same function. For instance, Kadane's Algorithm
         | usually takes a dozen+ lines in an algoly language, which would
         | normally be presented in some kind of syntax-coloured code
         | block external to descriptive paragraphs; it takes a dozen-
         | characters in an array language, and therefore variations can
          | be included inline in the text describing the alternatives.
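          | 
          | For reference, the "dozen+ lines in an algoly language"
          | version, sketched in Python (max subarray sum):
          | 
          |     def max_subarray_sum(xs):
          |         # Kadane's algorithm: track the best sum ending at the
          |         # current position and the best sum seen so far.
          |         best_ending_here = 0
          |         best_so_far = 0
          |         for x in xs:
          |             best_ending_here = max(0, best_ending_here + x)
          |             best_so_far = max(best_so_far, best_ending_here)
          |         return best_so_far
          | 
          |     print(max_subarray_sum([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # 6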
        
         | RyanHamilton wrote:
         | https://www.timestored.com/b/kdb-qsql-query-vs-sql/ How many
          | database tables have a date-time column and a natural ordering?
          | Most of the data I look at does. Which makes it crazy that SQL
          | is based on unordered sets.
        
           | faizshah wrote:
            | Thanks for the link, I think this is a really interesting
            | example:
           | 
           | > In qSQL this is: aj[`sym`time; t; q], which means perform
           | an asof-join on t, looking up the nearest match from table q
           | based on the sym and time column.
           | 
           | > In standard SQL, again you'll have difficulty: sql nearest
           | date, sql closest date even just the closest lesser date
           | isn't elegant. One solution would be:
           | 
           | > WITH cte AS (SELECT t.sym, t.time, q.bid, ROW_NUMBER() OVER
           | (PARTITION BY t.ID, t.time ORDER BY ABS(DATEDIFF(dd, t.time,
           | p.time))) AS rowNum FROM t LEFT JOIN q ON t.sym = q.sym)
           | SELECT sym,time,bid FROM cte WHERE rowNum = 1
           | 
           | > It's worth pointing out this is one of the queries that is
           | typically extremely slow (minutes) on row-oriented databases
           | compared to column-oriented databases (at most a few
           | seconds).
           | 
            | This is a really nice example, but I think it's more about
            | the asof join being a really useful operation; it appears in
            | both pandas https://pandas.pydata.org/docs/reference/api/pand
            | as.merge_as... and DuckDB
            | https://duckdb.org/docs/guides/sql_features/asof_join.html
           | 
           | Pandas:
           | 
           | > pd.merge_asof(t, q, on='time', by='sym',
           | direction='backward')
           | 
           | Duckdb:
           | 
           | > SELECT t.sym, t.time, q.value FROM t ASOF JOIN q ON t.sym =
           | q.sym AND t.time >= q.time;
           | 
           | So it seems more like this is benefitting from a useful
           | feature of time series databases rather than the features of
           | an APL-family language.
           | 
           | Personally I find the pandas syntax to be the most
           | straightforward here.
        
       | dang wrote:
       | Related. Others?
       | 
       |  _KlongPy: Vectorized port of Klong array language_ -
       | https://news.ycombinator.com/item?id=35400742 - April 2023 (8
       | comments)
       | 
       |  _Klong: a Simple Array Language_ -
       | https://news.ycombinator.com/item?id=21854793 - Dec 2019 (73
       | comments)
       | 
       |  _Statistics with the array language Klong_ -
       | https://news.ycombinator.com/item?id=15579024 - Oct 2017 (12
       | comments)
       | 
       |  _Klong - a simple array language_ -
       | https://news.ycombinator.com/item?id=10586872 - Nov 2015 (21
       | comments)
        
       | RestartKernel wrote:
       | I've been meaning to give something APL-ish a shot, but I don't
       | see how such languages can be justified in a non-academic
       | engineering environment. I doubt any company would like to limit
       | themselves to hiring programmers who are capable of maintaining
       | such code.
        
       ___________________________________________________________________
       (page generated 2024-12-02 23:00 UTC)