[HN Gopher] KlongPy: High-Performance Array Programming in Python
___________________________________________________________________
KlongPy: High-Performance Array Programming in Python
Author : cl3misch
Score : 97 points
Date : 2024-12-02 06:40 UTC (16 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| solidsnack9000 wrote:
| I gather where it says `:monad` it is referring to an operation
| that had an effect on the interpreter state?
| Pompidou wrote:
| No. In APL deriverd array programming languages, verbs (or
| functions) are monadic or dyadic : they accept only one or two
| arguments :
|
| In '1 + 1', + is a dyadic operator, while in 'exp 5', exp is
| monadic.
|
| In J, and in APL I guess, left arg is usually understood as
| 'control data', while right arg is the data upon which
| calculation is done. Left argument is usually left unchanged
| after calculations.
|
| In this way, is it possible to create multi-arguments verbs, by
| placing boxed args on the left of the verb.
| itishappy wrote:
| It sounds unrelated to the monads popularized by Haskell
| tutorials. Naming coincidence or am I missing something?
| odo1242 wrote:
| Naming coincidence, they are in fact unrelated.
| skruger wrote:
| APL's use of 'monad' predates Haskell's existence by
| decades.
| 082349872349872 wrote:
| ...and Leibniz's "windowless monads" (all three concepts
| are pairwise distinct, related solely by the greek prefix
| for 'single') predates APL and Haskell by centuries.
| solidsnack9000 wrote:
| A `monad` in Category Theory (which is where Haskell gets
| them from, ultimately) is a single-argument functor and two
| natural transformations, so there probably is a through-
| line here in terms of one-argument function-likes.
| solidsnack9000 wrote:
| Thanks kindly.
| incrudible wrote:
| Naive array programming is not really high performance in my
| book, because performing a lot of trivial arithmetic on large
| arrays leads to poor locality and high bandwidth pressure. The
| better alternative is SPMD, i.e. something like CUDA or ISPC for
| CPUs. This is possible with some type of JIT if the numpy style
| of programming is to be maintained, for example tinygrad.
| cl3misch wrote:
| If you like high-performance array programming a la "numpy with
| JIT" I suggest looking at JAX. It's very suitable for general
| numeric computing (not just ML) and a very mature ecosystem.
|
| https://github.com/jax-ml/jax
| CyberDildonics wrote:
| You are absolutely right, naive array programming might be much
| faster than raw python but it will never be high performance
| because you can't use caches and memory bandwidth effectively.
| bunderbunder wrote:
| The unstated major premise here is that you're working with
| data that's big enough that this becomes your major
| bottleneck.
|
| There's also a subset - possibly even a silent majority - of
| problems where you're doing fairly intensive calculation on
| data that fairly comfortably fits into a modern CPU cache.
| For those, I'd still expect SIMD to be a great choice from a
| performance perspective.
| CyberDildonics wrote:
| I'm not exactly sure what this means, but the bandwidth to
| the level 3 cache isn't actually more than memory. If you
| only have a few megabytes to work with, it's a lot less
| likely performance is going to matter anyway, and if it
| does you would still want to do multiple operations on each
| array item if possible, because if you put a few more lines
| in a loop you're talking about registers at that point, not
| cache.
| dzaima wrote:
| L3 cache bandwidth is often more than main memory; most
| systems at https://old.chipsandcheese.com/memory-
| bandwidth-data/ give at least a ~2x difference from
| whatever sample I clicked.
| bunderbunder wrote:
| And on a recent generation Xeon, your mid-level data
| cache can total up to 64MB, depending on model. That's
| maybe the more interesting one? L3 cache is shared among
| cores, so access to it can be more heavily impacted by
| memory concurrency concerns and not just pure bandwidth
| concerns. Meaning there's potentially a strong incentive
| to keep your task size below 2MB per core.
| CyberDildonics wrote:
| What are doing that is computationally intensive and are
| using python and a CPU with 64MB of cache for, but is
| under 64 MB total? How do you know you aren't losing
| performance by doing single operations on each array in
| python?
| bunderbunder wrote:
| Model training and inference, for example. Despite the
| hype, there's actually a lot of data science work that
| uses modestly-sized sets of structured data and doesn't
| really need a billion parameter deep learning model. And
| even when you're running inference on larger datasets
| it's often fine to just use mini batches.
|
| I don't really think of it as a "losing" performance
| problem. It's all about tradeoffs. SIMD tooling often
| offers an easier programming model than SPMD tooling, and
| spending an extra 60 minutes of development time to save
| myself 60 seconds of run time is generally not a
| worthwhile trade in my problem space. Especially if I
| also need to run it on more expensive compute instances
| to realize that speedup.
| CyberDildonics wrote:
| It seems like instead of a "silent majority doing
| intensive calculations" what you're really saying is "I
| don't think of slower speeds as a performance loss, I
| don't pay attention to that, I don't care about slower
| speeds, my data is small and my programs run almost
| instantly".
|
| If someone says "this isn't really high performance" and
| you say "I don't personally care, my stuff is small and
| doesn't take long time to run" that isn't a coherent
| reply.
| mlochbaum wrote:
| Memory traffic is certainly the biggest problem that the array
| paradigm presents for implementation, yes. I'd quibble with
| calling that "poor locality": when working with large arrays,
| if any part of a cache line is accessed it's very likely the
| entire line will be used in short order, which is the
| definition of good locality as I understand it. The issue is
| simply high memory use and a large number of accesses.
|
| I think it's an oversimplification to say SPMD is a better
| model full stop. Array programming has the advantage that the
| implementer optimizes a specific function, and its interaction
| with the rest of the program is much simpler: all the data is
| available at the start, and the result can be produced in any
| order. So things like multi-pass algorithms and temporary
| lookup tables are possible, particularly valuable if the
| program isn't spending that much time on arithmetic but rather
| "heavy" searching and sorting operations. Not that I claim
| NumPy or CuPy do a good job of this. Sections "Fusion versus
| fission" and "Dynamic versus static" are relevant here, I
| think:
| https://mlochbaum.github.io/BQN/implementation/versusc.html#...
| kstrauser wrote:
| This is awesome. I've wanted to play with array languages before
| but they tend to be kind of a pain in the neck to start with
| locally. I like the idea of hacking my way around in the new
| language while keeping a Python escape hatch close by in case I
| haven't yet learned how to do a thing the new way.
|
| Nice.
| MPSimmons wrote:
| I thought I knew python, but then I saw this:
| ?> sum::{+/x} :" sum + over / the array x :monad
| ?> sum([1 2 3]) 6 ?> count::{#x} :monad
| ?> count([1 2 3]) 3
|
| What?
| MPSimmons wrote:
| oooooooh I think I'm starting to see...
|
| sum::{} is the creation of a monadic function (which I just
| learned about because I don't have a CS background)
|
| :" is a comment
|
| sum([...]) is calling the function with a list, and the
| function will iterate over the list adding each of the elements
|
| count::{#x} is another monadic function that returns the number
| of elements, which then gets called
| gshulegaard wrote:
| I just briefly reviewed the README docs, I believe the
| KlongPy is a custom language which transpiles to Python. The
| REPL block you are trying to interpret is the KlongPy REPL
| not a Python one.
|
| Embedding KlongPy in a Python block would look more like this
| (also from the docs): from klongpy import
| KlongInterpreter import numpy as np data
| = np.array([1, 2, 3, 4, 5]) klong =
| KlongInterpreter() # make the data NumPy array
| available to KlongPy code by passing it into the interpreter
| # we are creating a symbol in KlongPy called 'data' and
| assigning the external NumPy array value
| klong['data'] = data # define the average function in
| KlongPY klong('avg::{(+/x)%#x}') # call the
| average function with the external data and return the
| result. r = klong('avg(data)') print(r) #
| expected value: 3
|
| Note the calls to "klong('<some-str-of-klong-syntax')".
| itishappy wrote:
| I'm new too, so take my opinion with a grain of salt, but I
| think you're right with one minor correction:
| name::{body} :" function declaration
|
| This isn't necessarily niladic or monadic or dyadic on it's
| own, it depends on the body. The bodies from the example all
| just happened to be monadic. +/x
| :" monadic : sum over argument x #x :"
| monadic : count of the argument x + :"
| dyadic : sum !10 :" niladic : an array from
| 0 to 9
|
| I had to look up how calling a dyadic function looks in
| Klong: add::{+} add(2;3)
| MPSimmons wrote:
| Man, that's so weird. Thanks for the clarification.
| nils-m-holm wrote:
| The langiage in the examples is Klong, not Python. KlongPy
| translates Klong to Python and then executes it.
| asloan23 wrote:
| Sorry if this is a dumb question, why is this weird in Python?
| I use R as my main language and this is exactly how I would
| expect it to work. Does Python not do this naturally?
| faizshah wrote:
| > sum::{+/x}
|
| > count::{#x}
|
| These two expressions are not valid in R or Python. They are
| implicitly declaring a function without declaring the
| arguments, it is implicitly iterating over the array and
| returning the result (all features of array languages to make
| it more concise).
|
| Python's + and / are a function of two objects (dyadic) and
| are syntactic sugar (they can't be treated as function
| values, thats what the operator module is for): https://docs.
| python.org/3/reference/datamodel.html#object.__...
|
| The array languages optimize for concise code so a lot of
| things are done implicitly that are done explicitly in
| Python.
|
| The python equivalent:
|
| > _sum = lambda x : functools.reduce(operator.add, x)
|
| > _count = lambda x: functools.reduce(lambda acc, _: acc + 1,
| x, 0)
|
| Of course in python and R there are built ins for these
| already (sum(x) and len(x)) but this is just to show what
| array languages are doing for you.
| jofer wrote:
| Klong is a different language than Python. Python using numpy
| would look broadly similar for the first few examples, yes.
| However, this is a way of executing a different language on
| numpy arrays. Array languages are a different paradigm. Klong
| is an array language, while Python + numpy allows some array
| paradigms, but isn't an array language.
|
| Notice how they're _defining_ the "sum" operation there.
| Instead of being something builtin, they defined "sum" as
| {+/x}.
|
| {+/x} is the interesting part. That's Klong. It's not being
| able to define a "sum" operation, it's that "sum" can be
| expressed as {+/x}. That's very different than both R and
| python.
| nils-m-holm wrote:
| Good to see KlongPy thrive! It is based on Klong
| (http://t3x.org/klong/), which I do not maintain any more, so I
| am glad that Brian took over!
| amarcheschi wrote:
| I'm gonna be the one who asks the dumb question, but someone has
| to do it: why are expressions evaluated from right to left?
| bear8642 wrote:
| Because that's what APL does:
| https://www.jsoftware.com/papers/EvalOrder.htm
|
| It also simplifies things as no need to implement BIDMAS
| abrudz wrote:
| Think "normal" call syntax like
| foo(bar(baz(42)))
|
| and then remove the superfluous parens foo bar
| baz 42
|
| The expression is evaluated from right to left.
|
| Now, let's make two of the functions into object members:
| A.foo(bar(B.baz(42)))
|
| Remove the parens, extracting the methods from their objects,
| instead feeding each object as a left argument to its former
| member function: A foo bar B baz 42
|
| This is normal APL-style call syntax; right-to-left if you
| want.
| behnamoh wrote:
| What are some real-world practical applications of array
| programming? Because as far as I know, APL did not catch on, and
| its subsequent versions like J and K and BNQ also did not become
| useful in industry. Maybe there's something I'm missing. Why
| would you want to do array programming in Python? What are the
| advantages over regular Python programming?
| fluorinerocket wrote:
| I thought they had niche in finance.
| mlochbaum wrote:
| APL did catch on to some extent, see
| https://news.ycombinator.com/item?id=39471718 .
|
| Without getting into any discussion of the array paradigm
| itself, the reason commercial programming is such a winner-
| take-all system now is hiring and organizational difficulties,
| not that popular languages are so much more productive than
| unpopular ones. Building a working application is the easy part
| of running a business.
| itishappy wrote:
| Array programming databases (eg: KDB+, Shakti) are (per their
| own metrics) way faster than the competition. Licensing fees
| are in the 6 digits, so I'm not comparing them personally, but
| people pay that exorbitant amount, and I assume they do so for
| a reason.
|
| https://shakti.com/
| juliangoldsmith wrote:
| The array languages aren't super popular because of the sharp
| learning curve. They're a lot different than most other
| languages, and they have a lot of operators that simply don't
| exist in something like C++.
|
| A few years ago there was an article about K's use in high-
| frequency trading. I'm not sure about usage of APL and J,
| though. BQN is still fairly new, so it will take a while to see
| much production usage.
|
| If you've ever written code using NumPy, Tensorflow, or
| PyTorch, you're doing array programming. Those libraries are
| heavily influenced by the array languages, including taking a
| lot of terminology (rank, etc.). I've personally found that
| playing with J and BQN helped me understand Tensorflow, and
| vice versa.
| behnamoh wrote:
| > including taking a lot of terminology (rank, etc.).
|
| AFAIK that's a linear algebra term for matrices, not
| something array programming or Pytorch or TF invented.
| ssfrr wrote:
| it's an unfortunate terminology collision.
|
| - array languages: rank is the dimensionality of an array,
| i.e. a vector is rank-1, a matrix is rank-2, a N-D array is
| rank-N
|
| - linear algebra: rank is the number of linearly-
| independent columns (and also rows)
|
| So for example, if you have a 5x5 matrix where 4 of the
| columns are linearly independent, it would be rank-4 in the
| linear algebra sense, and rank-2 in the array language
| sense.
|
| I guess (though I've never really thought of it before)
| that you could say that the array-language definition is
| the rank (in the linear algebra sense) of the index space.
| Not sure if that's intentional.
| cturner wrote:
| I would love something like duolingo, but for an array language.
| Not a steady series of challenging puzzles like leetcode sites,
| Rather, a series of questions with high repetition, somewhat like
| flash cards, easy to drop into for a few minutes.
| 082349872349872 wrote:
| Have you tried following (working through) Iverson's Turing
| Award lecture?
|
| https://dl.acm.org/ft_gateway.cfm?id=1283935&type=pdf
| abrudz wrote:
| See https://github.com/Isaac-Flath/anki and
| https://github.com/aeamaea/APL_Anki_Decks
| eismcc wrote:
| KlongPy author here: AMA
| sevensor wrote:
| Would it make sense to think of this as a compact syntax for
| numpy? When it comes to array operations, are there differences
| that go deeper than the syntax?
| eismcc wrote:
| KlongPy has a lot of other features beyond pure NumPy
| operations (such as IPC and web server), which you could see
| as a kind of making use of array operations in some
| application. You could look at the core Klong language as
| what you suggest.
| Qem wrote:
| On Linux rlwrap is used to get the REPL working. Is possible to
| get the REPL working under PowerShell in a Windows box too?
| faizshah wrote:
| I've tried several times to give J/k/Q/kdb a chance but I haven't
| really found a convincing example why this approach is better
| than say SQL or say numpy/jax etc.
|
| The syntax has the same problem as perl in that you have to learn
| too many symbols that are hard to look up. And this combined with
| the tacit style makes it difficult to parse what the code is
| doing.
|
| I think ruby went in the radically opposite direction in that it
| provides a similar feature set to perl with similar functional
| features but everything is human readable text. I feel like
| there's room for a human readable version of J that relies less
| on syntactical sugar.
|
| I do think the direction Klong has gone with the syntax is an
| improvement: https://t3x.org/klong/ambiguity.html
|
| I'm curious if anyone has a go-to article/example of why an array
| language would be a good choice for a particular problem over
| other languages?
|
| I know the second chapter of J for C programmers is kinda like
| that but I didn't really find it convincing:
| https://www.jsoftware.com/help/jforc/culture_shock.htm#_Toc1...
| ramses0 wrote:
| There was a step-change improvement for me when I tried
| expressing some JS patterns via `underscore.js` instead of
| procedurally: eg: http://underscorejs.org/#each
|
| Thinking of something as `each | map | filter | sum` is waaay
| less buggy than writing bespoke procedural code to do the same
| thing. No doubt there is a "cost" to it as well, but the
| _abstraction_ is valuable.
|
| Now, if there were a "compiler" which could optimize that whole
| pipeline down and squeeze out the inefficiencies between steps
| because it could "see" the whole program at the same time. (oh,
| I don't know, something like `SELECT * FROM foo WHERE a > 100
| AND b < 9000 LEFT JOIN bar ON ...etc...`)
|
| ...perhaps you could get both an expressivity gain (by using
| higher level concepts than "for" and "while"), a reduction in
| bugs (because you're not re-implementing basic work-a-day
| procedures), and an improvement in efficiency (because the
| "compiler" can let you express things "verbosely", while it
| sorts out the details of efficiency gains that would be tenuous
| to express and keep up to date by hand).
| faizshah wrote:
| I 100% agree, I think the functional features that have been
| added across all the popular languages (map, reduce, fold
| etc.) has been a positive. Nothing demonstrates it better
| (imo) than purrr in R:
| https://github.com/rstudio/cheatsheets/blob/main/purrr.pdf
|
| I also think there is some merit to "high syntactical
| density" clearly if you can see the entire code in one place
| instead of having to navigate through many files or sections
| that's beneficial. (Heavily discussed in the last big HN
| thread: https://news.ycombinator.com/item?id=38981639)
|
| I also think JQ has proven the merit of tacit functional
| languages in that you can concisely write arbitrary
| transforms/queries on json that can be more expressive than
| SQL (many SQL engines have added JSONPath anyway). And I also
| think postfix is great for processing pipelines.
|
| But I am not totally convinced in the approach of APL/J/Q/KDB
| for the combination of terse style + prefix + tacit because
| it makes the code so difficult to read. I think if you took
| an approach similar to JQ where instead of relying on symbols
| operators were just human readable words it would be easier
| to get us half way there to trying out the verb, adverbs etc.
| approach of the APL family. The problem with making it human
| readable text is that you lose the conciseness which is part
| of the draw of the APL family as they want to have a high
| syntax density and analogous code to mathematical
| expressions.
| 082349872349872 wrote:
| The biggest advantage of the terse symbolic style for me comes
| when you wish to rapidly compare different ways (algorithms) to
| compute the same function. For instance, Kadane's Algorithm
| usually takes a dozen+ lines in an algoly language, which would
| normally be presented in some kind of syntax-coloured code
| block external to descriptive paragraphs; it takes a dozen-
| characters in an array language, and therefore variations can
| be included inline to text describing the alternatives.
| RyanHamilton wrote:
| https://www.timestored.com/b/kdb-qsql-query-vs-sql/ How many
| database tables have a date time column and a natural ordering?
| Most the data I look at. Which makes it crazy that sql is based
| on unordered sets.
| faizshah wrote:
| Thanks for the link I think this is a really interesting
| example:
|
| > In qSQL this is: aj[`sym`time; t; q], which means perform
| an asof-join on t, looking up the nearest match from table q
| based on the sym and time column.
|
| > In standard SQL, again you'll have difficulty: sql nearest
| date, sql closest date even just the closest lesser date
| isn't elegant. One solution would be:
|
| > WITH cte AS (SELECT t.sym, t.time, q.bid, ROW_NUMBER() OVER
| (PARTITION BY t.ID, t.time ORDER BY ABS(DATEDIFF(dd, t.time,
| p.time))) AS rowNum FROM t LEFT JOIN q ON t.sym = q.sym)
| SELECT sym,time,bid FROM cte WHERE rowNum = 1
|
| > It's worth pointing out this is one of the queries that is
| typically extremely slow (minutes) on row-oriented databases
| compared to column-oriented databases (at most a few
| seconds).
|
| This is a really nice example but I think it's more about
| this as of join being a really useful operation, it appears
| both pandas https://pandas.pydata.org/docs/reference/api/pand
| as.merge_as... And duckdb
| https://duckdb.org/docs/guides/sql_features/asof_join.html
|
| Pandas:
|
| > pd.merge_asof(t, q, on='time', by='sym',
| direction='backward')
|
| Duckdb:
|
| > SELECT t.sym, t.time, q.value FROM t ASOF JOIN q ON t.sym =
| q.sym AND t.time >= q.time;
|
| So it seems more like this is benefitting from a useful
| feature of time series databases rather than the features of
| an APL-family language.
|
| Personally I find the pandas syntax to be the most
| straightforward here.
| dang wrote:
| Related. Others?
|
| _KlongPy: Vectorized port of Klong array language_ -
| https://news.ycombinator.com/item?id=35400742 - April 2023 (8
| comments)
|
| _Klong: a Simple Array Language_ -
| https://news.ycombinator.com/item?id=21854793 - Dec 2019 (73
| comments)
|
| _Statistics with the array language Klong_ -
| https://news.ycombinator.com/item?id=15579024 - Oct 2017 (12
| comments)
|
| _Klong - a simple array language_ -
| https://news.ycombinator.com/item?id=10586872 - Nov 2015 (21
| comments)
| RestartKernel wrote:
| I've been meaning to give something APL-ish a shot, but I don't
| see how such languages can be justified in a non-academic
| engineering environment. I doubt any company would like to limit
| themselves to hiring programmers who are capable of maintaining
| such code.
___________________________________________________________________
(page generated 2024-12-02 23:00 UTC)