[HN Gopher] What about K?
___________________________________________________________________
What about K?
Author : tosh
Score : 167 points
Date : 2025-02-10 12:51 UTC (10 hours ago)
(HTM) web link (xpqz.github.io)
(TXT) w3m dump (xpqz.github.io)
| sebg wrote:
| A companion guide that I always recommend if interested in K is:
| Q for mortals, found here - https://code.kx.com/q4m3/
|
| Note, from wikipedia: Q serves as the query language for kdb+, a
| disk based and in-memory, column-based database. Kdb+ is based on
| the language k, a terse variant of the language APL. Q is a thin
| wrapper around k, providing a more readable, English-like
| interface.
| mwexler wrote:
| Pulled from above:
|
| Coding Style
|
| The q gods have no need for explanatory error messages or
| comments since their q code is perfect and self-documenting. Even
| experienced mortals spend hours poring over cryptic q error
| messages such as the ones above. Moreover, many mortals eschew
| comments in misanthropic coding macho. Don't.
|
| A more enjoyable read than the parent post.
| nialv7 wrote:
| There's also Nial: https://github.com/danlm/qnial7 which is
| (pardon the oversimplification) APL but with words instead of
| symbols.
| steveBK123 wrote:
| A set of links with good examples of common problems solved in
| Q
|
| https://code.kx.com/phrases/wikipage/
|
| https://code.kx.com/q/kb/programming-idioms/
|
| https://code.kx.com/phrases/
| kvdveer wrote:
| The linked document only contains a warning about how versioning
| is weird, and a description of the syntax. No examples beyond
| trivial one-liners.
|
| What problem is K trying to solve? What does a K program look
| like?
| sz4kerto wrote:
| Absolutely not being sarcastic: one problem it solves is that
| it is very hard to read as a beginner, so it can be
| intimidating (although it becomes much easier to read a bit
| later). This, coupled with the general arrogance of k/q
| practitioners (again, not really saying this in a negative way)
| and that k, kdb, etc. deliberately don't give you guardrails
| makes people who write k/q seem a bit 'mythical' and make them
| feel very clever.
|
| So I think k, q and kdb are fun to work with, but one of the
| major components of its success is that it allowed a community
| (in finance) to evolve that can earn 50-150% more than their
| peer groups who do the same work in Java or C++. 10 years ago a
| kx course cost $1500 per person per day.
| pjmlp wrote:
| Note that those are typical prices for enterprise-level
| certifications, including for some products that Java or C++
| devs might need to interact with when working in those kinds
| of environments.
| bear8642 wrote:
| K is a fast vector language, used (primarily) for time series
| data analysis.
|
| >What does a K program look like?
|
| You might want to check out
| https://news.ycombinator.com/item?id=40335921
|
| beagle3 and geocar both have various comments you might want to
| search for.
| mananaysiempre wrote:
| > a fast vector language
|
| With an Oracle-style DeWitt clause[1] prohibiting public
| benchmarks.
|
| [1]
| https://mlochbaum.github.io/BQN/implementation/kclaims.html
| rustc wrote:
| Shakti (the latest K implementation by the author of K)
| claims [1] to load a 50gb csv in 1.6 seconds which
| according to them takes 265 seconds with Polars. Has anyone
| independently verified these claims? Is Polars really
| leaving 2 orders of magnitude performance on the table?
|
| [1]: https://shakti.com/ -> Compare -> h2o.k
| bear8642 wrote:
| > [1]: https://shakti.com/ -> Compare -> h2o.k
|
| You can link to the subsections:
| https://shakti.com/compare/h2o.k
| orlp wrote:
| Disclaimer: I work for Polars inc.
|
| As a sanity check I just cloned
| https://github.com/h2oai/db-benchmark, ran the data
| generation script and ran on a 64 core AMD EPYC (AWS
| c7a.16xlarge):
|
|     import polars as pl
|     lf = pl.scan_csv("G1_1e9_1e2_0_0.csv")
|     print(lf.select(pl.col.v1.sum()).collect())
|
| The above script ran in 7.58 seconds.
|
| If I change the collect() to collect(new_streaming=True)
| to use the new streaming engine I've been working on, it
| runs in 6.90 seconds.
|
| I can't realistically time the full "read CSV to memory"
| with this 50 GB file on this machine as we start swapping
| (this machine has 128GiB memory) and/or evicting data
| from disk cache (this machine has a slow EC2 SSD attached
| to it), so we do have a blow-up of memory usage (which
| could be as simple as loading small integers into an
| 8-byte Uint64 column). I think it's likely that on K's
| machine the "read full CSV to memory" approach also
| started swapping, giving the large runtime. However, in
| Polars you'd typically write your query using LazyFrames,
| which means we don't actually have to load the full CSV
| into memory.
|
| EDIT: running on a m7a.16xlarge with twice the memory
| (256GiB) once the CSV file is in disk cache Polars can
| parse the full CSV file into an in-memory dataframe in
| 7.68 seconds.
|
| K's claim that it parses the full 50GB CSV in 1.6 seconds,
| if true, is very impressive regardless.
| mananaysiempre wrote:
| Honestly 7 seconds even just to parse the CSV is already
| pretty impressive, 7GB/s would be simdjson speeds if you
| did it on a single core. Do you have a single-threaded
| parser with really well-tuned SIMD, or a speculative
| parallel one, or ..?
| orlp wrote:
| We have a single-threaded chunker that scans serially
| over the file. This chunker exclusively finds unquoted
| newlines (using SIMD) to find clean parallelization
| boundaries, it doesn't do any further parsing. Those
| parallelization boundaries are then used to feed worker
| threads chunks of data to properly parse into our in-
| memory representation (which mostly follows Arrow).
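|
| A minimal sketch of the chunking idea described above, in Python
| rather than Polars' actual Rust code. The names
| find_chunk_boundaries, parse_chunk and parallel_parse are
| hypothetical, the scan is a plain byte loop rather than SIMD, and
| it assumes RFC 4180-style quoting (a doubled quote escapes a quote,
| which a simple toggle handles because it flips the state twice):

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor


def find_chunk_boundaries(data: bytes, target_chunk_size: int) -> list[int]:
    """Serial pass: return offsets just past unquoted newlines,
    spaced at least target_chunk_size bytes apart."""
    boundaries = [0]
    in_quotes = False
    last = 0
    for i, b in enumerate(data):
        if b == 0x22:  # '"' toggles the quoted state
            in_quotes = not in_quotes
        elif b == 0x0A and not in_quotes:  # newline outside quotes
            if i + 1 - last >= target_chunk_size:
                boundaries.append(i + 1)
                last = i + 1
    if boundaries[-1] != len(data):
        boundaries.append(len(data))  # final chunk runs to end of input
    return boundaries


def parse_chunk(chunk: bytes) -> list[list[str]]:
    """Worker: fully parse one chunk that starts and ends on a row boundary."""
    text = chunk.decode("utf-8")
    return list(csv.reader(io.StringIO(text, newline="")))


def parallel_parse(data: bytes, target_chunk_size: int, workers: int = 4):
    """Split on clean boundaries, parse chunks in worker threads, keep order."""
    bounds = find_chunk_boundaries(data, target_chunk_size)
    chunks = [data[a:b] for a, b in zip(bounds, bounds[1:])]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(parse_chunk, chunks)  # map preserves input order
        return [row for part in parts for row in part]


if __name__ == "__main__":
    sample = b'a,b\n1,"x\ny"\n2,z\n'
    print(find_chunk_boundaries(sample, 1))
    print(parallel_parse(sample, 1))
```

| Because of Python's GIL this only illustrates the structure (one
| cheap serial scan feeding independent parsers); it is not a claim
| about the speedups discussed in this thread.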
| LegionMammal978 wrote:
| Would you know how much of the total runtime is devoted
| to the initial chunking process? Amdahl's law would
| prefer an entirely speculative approach in the limit, but
| I could imagine that the 2x overhead might not be worth
| it for reasonable file sizes and core counts.
|
| (But even then, 1.6 s would be quite a feat. It makes me
| wonder if the K implementation is partially lazy, as you
| say typical Polars usage is.)
| orlp wrote:
| It seems from a profile that on the eager engine the
| serial scanner is able to feed ~32 threads worth of
| decoding: https://share.firefox.dev/4hS1eJa.
|
| It might be worth speculating, or at least optimizing the
| serial chunker more. You could theoretically start a
| second serial chunker from the end working backwards but
| that would not be wise with our ordered streams, as the
| decoded data would have to be buffered for a long time.
|
| Similarly on the new streaming engine, each thread is
| active ~half of the time, except the thread running the
| chunking task: https://share.firefox.dev/3WQV9og.
|
| Note that in a lot of realistic workloads on the
| streaming engine compute can happen in between decodes,
| completely hiding the bottleneck. Also all of the above
| is with the file being completely in file cache, if fed
| from a slow SSD it's not a bottleneck whatsoever.
| swiftcoder wrote:
| This is kind of the problem with every introductory text to an
| APL-family language.
|
| I get the idea that one either already knows one needs an array
| programming language, or doesn't grok why anyone would need one
| reedf1 wrote:
| K solves the problem of bank account for two groups of people,
| kX Systems and quants.
| FjordWarden wrote:
| I've only played around with k and APL in my spare time so I
| can't speak to real world problems. It is a ridiculously
| powerful query language, where in SQL you have only started
| writing `SELECT ...`, in k you are already done. But you need
| to have very good tacit knowledge of algorithms and the weird
| syntax to be productive, like oh I need to calculate an
| integral-image of this time-series, but that's just a pre-scan
| over addition, boom and you are done. The theory of array
| programming with a focus on combinators is also an interesting
| perspective on functional programming. IMHO not something you
| should write full programs in, but that hasn't stopped some from
| trying.
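|
| To make the "scan over addition" remark concrete, here is a rough
| Python sketch (in k the running sum is the one-liner +\x). The
| helper names prefix_sums and window_sum are made up for
| illustration:

```python
from itertools import accumulate


def prefix_sums(series):
    """Running totals: element i is the sum of series[0..i]."""
    return list(accumulate(series))


def window_sum(sums, lo, hi):
    """Sum of series[lo:hi], read off the prefix sums in O(1)."""
    upper = sums[hi - 1] if hi > 0 else 0
    lower = sums[lo - 1] if lo > 0 else 0
    return upper - lower


series = [3, 1, 4, 1, 5, 9]
sums = prefix_sums(series)
print(sums)                    # [3, 4, 8, 9, 14, 23]
print(window_sum(sums, 2, 5))  # 10, i.e. 4 + 1 + 5
```

| Once the scan exists, any window total is a subtraction of two
| entries, which is the one-dimensional version of an integral image.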
| bee_rider wrote:
| This was a helpful comment. After the article, the question
| that popped into my head was... so ok should I try and
| compare this to like BLAS or something like Jax?
|
| But, this sort of language is more about writing and reading
| from the disk efficiently, right? I guess SIMD type
| optimizations would be less of a thing.
| FjordWarden wrote:
| I think that array languages have historically used memory
| mapped files for IO, and treat them like a big data frame,
| but other versions also support streaming IO. It's up to the
| implementers of the runtime to use SIMD instructions if
| they deem this optimal but not something you would use
| yourself.
| Pet_Ant wrote:
| I feel like measuring things in characters is not meaningful,
| but only in tokens. Replacing "SELECT" with "SEL" would not
| improve SQL in the slightest.
| Thorrez wrote:
| A one-liner in k tends to be equivalent to a much larger
| program in another language.
|
| Here's a program in k. I'm not sure exactly what it does. I
| think it might be a json encoder/decoder:
|
| https://github.com/KxSystems/kdb/blob/master/e/json.k
| cubefox wrote:
| It appears you accidentally linked to a log where someone fell
| on his keyboard.
| andai wrote:
| I think Whitney's greatest achievement isn't even any of
| his languages--though they are very impressive--but that he
| convinced banks to pay him millions of dollars to write
| IOCCC style code!
| bregma wrote:
| Dialup modems on a bad connection used to generate more
| readable code.
| saghm wrote:
| It says a lot that the name of the file is more
| informative about what the code does than the entirety of the
| file itself. "Readability is a property of the reader"
| indeed, but also the writer...
| poulpy123 wrote:
| The problem solved by K is the long-term employment of people
| writing K. You can't be fired if you're the only one who
| more or less understands the codebase.
| dboreham wrote:
| This is true about more software development than you
| realize.
| z5h wrote:
| > Readability is a property of the reader, not the language.
|
| Similarly, the inability of a person to write machine code
| directly is a property of the person, not the hardware. Yet some
| of these people admit their limitations and use K.
| pyrale wrote:
| Silicon computers are a crutch for the people too flawed to run
| their calculations in their head.
| Ygg2 wrote:
| "It is by will alone I set my mind in motion. It is by the
| juice of Sapho that the thoughts acquire speed, the lips
| acquire stains, the stains become a warning. It is by will
| alone I set my mind in motion..."
| psychoslave wrote:
| Do you mean that not everyone uses butterflies yet? How quaint!
| coder543 wrote:
| Butterflies? Easier to catch a caterpillar and set it on the
| right path to one day flap its wings in just the desired way.
| tempodox wrote:
| Let me sell you an AI that genetically engineers
| caterpillars to do just that.
| jagged-chisel wrote:
| But can it also provide instructions to set the universal
| constants at the start so the universe evolves the
| desired result?
| tempodox wrote:
| I just asked it and it said yes.
| gpderetta wrote:
| Let there be light!
| Ygg2 wrote:
| Caterpillar? Too high tech. I just restart the universe
| until the problem is pre-solved.
| camdv wrote:
| Chinese has a readability issue to the English speaker. That
| doesn't mean it's not readable.
| bobthepanda wrote:
| The original comment is a joke
| gitonthescene wrote:
| This comment would seem to address the point of that joke
| hcfman wrote:
| I built a language called K for my master's thesis in 1984. Who
| was first?
| mlochbaum wrote:
| You win! Whitney was just out of graduate school at the time,
| and had worked some with APL at I.P. Sharp but was implementing
| "object-oriented languages, a lot of different LISPs,
| Prolog"[0]. Next was the more APL-like A around 1985 and K only
| in 1992.
|
| [0] https://queue.acm.org/detail.cfm?id=1531242
| ZeroCool2u wrote:
| I cannot warn folks against using q/kdb+ enough. Use Polars or
| DuckDB, get the job done, and enjoy your life.
| jerjerjer wrote:
| Eh, no need. Author states in the first two paragraphs that
| there are 9 versions of k, each developed from scratch and
| incompatible with each other. Anyone who develops software for
| money should and would leave immediately. I do appreciate the
| honesty, though.
| 7thaccount wrote:
| What has your experience been like? What are the drawbacks
| besides the cost and proprietary nature?
| ZeroCool2u wrote:
| I don't want to be too disparaging, so I will just say that
| the language is exotic. Otherwise, the licensing model is
| Oracle-esque based on host and core counts etc. The software
| is fast, that you cannot deny, though it does critically
| depend on the speed of the storage attached to the host.
| Also, it's written in C++ and it shows. Had to do multiple
| (paid) upgrades due to memory leaks.
|
| I'm sure there was a time it was best in class and even now
| maybe it's the best for a few niche use cases, but unless
| you're absolutely certain you need it, I would flee from it
| and save your sanity.
| 7thaccount wrote:
| I thought it was written in just plain C based off old
| Arthur Whitney stories.
|
| Yeah...Oracle licensing sounds scary and having to pay to
| fix their own memory leaks sounds frustrating.
|
| Thanks for the experience.
| boothby wrote:
| > and enjoy your life.
|
| As somebody who hacks on, around and in esoteric languages for
| fun; I must object.
| ZeroCool2u wrote:
| And as someone that has written an interpreter from scratch
| in F#, and since there's a free trial version, I'd say go for
| it and have fun and live your best life! Just perhaps
| reconsider allowing your livelihood to be dependent on it :)
| HexDecOctBin wrote:
| What is the difference between APL and all the various APL-like
| languages like BQN, J and K? Which one should a beginner start
| with? Which has the best tooling for debugging, type checking,
| etc.?
| radiator wrote:
| I think the best today cannot be APL, because it carries so
| much historical baggage and because commercial implementations
| dominate it. So start with BQN, it is free, has the tooling and
| it also has succeeded in building a community.
| skruger wrote:
| Depends. APL is the OG. Try a few and see what you like. If you
| learn one Iverson language, it's pretty easy picking up the
| others.
|
| Here's a gentle guide to APL by the same author (me):
|
| https://xpqz.github.io/learnapl/
|
| Dyalog APL is likely the best supported in terms of tooling,
| debugging etc. If you're looking for static typing, you're in
| the wrong place.
| tomku wrote:
| There's several ways to look at the differences.
|
| The one that will jump out at most programmers who are familiar
| with mainstream languages is that J, k, q and Nial use ASCII
| characters while APL, BQN and Uiua prefer glyphs. q and Nial
| additionally favor words rather than shortened abbreviations,
| and Uiua has plain words that auto-format to its glyphs to aid
| in typing. The other glyph-based languages rely on custom
| (software) keyboard layouts or input methods to let you type
| the symbols they need. You do not need a special keyboard to
| program in any of these languages. ASCII-or-not is not a
| decision that any of the array languages have made lightly or
| for purely aesthetic reasons, it has deep consequences for how
| the languages feel that won't really make sense until you get
| some hands-on experience. As a beginner you'll probably
| gravitate towards one of the sides without understanding those
| deeper implications, and that's totally okay, but please keep
| an open mind.
|
| If access to a high-quality open-source implementation is
| important for you, your options narrow a bit. J, BQN, Uiua and
| Nial all have a primary implementation that's open source. k
| has implementations that are open-source but the official
| versions of k that most people use "in anger" are commercial
| products with a limited free trial, and afaik there's no mature
| open-source versions of kdb+/q, which are kind of k's killer
| app. There are many implementations of APL but Dyalog is the
| clear leader and it's a closed-source commercial product with a
| personal/non-commercial free version. I wish this was less of a
| factor because it's so hard to get people interested in
| languages when the best versions aren't available to them, but
| it has gotten better in recent years.
|
| Regarding tooling, you should go in with minimal expectations.
| Some of the tooling is quite good (particularly J and Dyalog
| APL, in my opinion) but it's heavily biased towards the
| specific type of iterative, interactive development that nearly
| all array programmers favor. Debuggers are sometimes present
| but usually not a primary tool. None of the major array
| languages have static typing. There are some array-adjacent
| languages like Futhark and Dex that do, but they're very
| different than the "Iversonian" array languages you asked
| about, and are also active research projects.
|
| (Edit: Also worth mentioning that package managers and build
| systems are not common in the array world.)
|
| There are many other differences that matter immensely to the
| array community but you won't have context for as a beginner,
| so I'm not going to go too deep into them, but if you're
| curious, https://github.com/codereport/array-language-comparisons
| has some comparison tables and example code written
| in a variety of languages. code_report/Conor's Youtube channel
| at https://www.youtube.com/@code_report/ is also an excellent
| place to get exposure to various array languages and concepts.
|
| All that said, in my opinion the easiest languages to recommend
| to get started are BQN and J, depending on whether you want
| glyphs or not. If you're comfortable using a closed-source tool
| with restrictive licensing, Dyalog APL is also an excellent
| choice. Any of the three will show you both the joys and pains
| of array programming if you put time into learning it, and give
| you enough context to make an informed decision about going
| deeper or finding another array language more to your taste.
| cess11 wrote:
| J has an Android interpreter, which for me as a non-
| professional dabbler is the killer app since it means I can
| study and play on my handheld devices when I'm on a break
| from work or family.
|
| The documentation is pretty decent compared to the other
| members of the Iverson gang and the libraries one can install
| with the desktop version make it somewhat batteries
| included; at least it's easy to suck in a file and start
| rendering plots.
|
| Maybe BQN can compete on these things nowadays, I'm not sure.
| dzaima wrote:
| You can run BQN in termux on Android pretty well. A list of
| libraries is available at
| https://github.com/pellertson/awesome-bqn.
| https://mlochbaum.github.io/BQN/ has pretty good
| documentation.
| rscho wrote:
| The main difference separating them is the array model. APL has
| the so-called 'nested array' model, meaning that everything is
| an array. J has a 'flat array model' meaning scalars are
| distinct from arrays. Both models introduce typing
| inconsistencies preventing efficiency. BQN tried to remedy this
| and use an efficient compiler. What sets K apart is that it
| does not have multidimensional arrays, but just lists of 1D
| arrays. This makes K ideal for financial work, while the others
| are more non-financial math-oriented.
| jamal-kumar wrote:
| I always thought it sounded super cool but it just doesn't exist
| in the problem spaces I work in. Like kdb+ was specifically
| designed to be run on bare metal without a full OS in the way of
| things going fast, and in quant environments where you're trying
| to shave off nanoseconds on the computations because your
| company's gone and invested in a dedicated fibre line to trading
| servers.
| eudhxhdhsb32 wrote:
| That's actually not true at all. No one who cares about
| nanoseconds is using kdb+ for a production trading system.
|
| It's primarily used for trading research and surveillance, not
| live trading. And I've never heard of anyone running it without
| an OS.
| bear8642 wrote:
| > And I've never heard of anyone running it without an OS.
|
| kOS is in development though current status is unknown.
|
| (https://gist.github.com/chrispsn/da00835bb122c42f429a084df83
| ...)
| 7thaccount wrote:
| I think that got abandoned ages ago.
| WorkerBee28474 wrote:
| > No one who cares about nanoseconds is using kdb+ for a
| production trading system.
|
| For those curious, what they're actually using is FPGAs and
| custom silicon.
| pie_flavor wrote:
| > Readability is a property of the reader, not the language.
|
| Uiua[0]'s stack model is much more annoying to work with, but I
| really appreciate its embrace of unicode glyphs. Every other
| derivative of APL throws those out at the first opportunity, but
| when you have a lot of glyphs, you stop being so tempted to make
| different arities cause the same glyph to mean wildly different
| things, when the arity is not actually written down explicitly
| and depends on whether the next thing to the left is a parameter
| or another function. Once you can See The Matrix, _this_ is the
| chief thing that still does make K and friends objectively
| unreadable in a way they don't have to be.
|
| [0]: https://uiua.org
| xg15 wrote:
| I appreciate the idea (and Uiua's examples indeed look
| beautiful, almost like visual programming) but I'd at least
| like some obvious way how to _pronounce_ the code.
| RodgerTheGreat wrote:
| All of the symbols in Uiua have short english words as
| alternate names, and the online editor allows you to type
| them by alias.
|
| K has "traditional names" for all the primitive operators
| which appear in reference cards and which are typically used
| when discussing code aloud with other K programmers. Q and
| Lil, which are both K descendants, outright replace some
| symbols with those named keywords. Named keywords can make
| the primitives superficially easier to remember, at the cost
| of making idiomatic patterns in the language less visually
| apparent.
| xg15 wrote:
| Ah, that makes a lot more sense. Thanks!
| blablablerg wrote:
| "K is a general-purpose programming language that excels as a
| tool for data wrangling, analytics and transformation."
|
| How does it compare to R/tidyverse?
| 7thaccount wrote:
| It's mainly for quants where you couple the array language with
| a time series database of all your stock quotes. Once you
| understand the language you can do a ton of analysis with
| extremely little code. Think of it as a mathematical SQL
| dialect I guess.
|
| In my opinion, it's very cool, but Python's ecosystem (and R's)
| is just so much better with scientific libraries and charting
| and all that. Kdb+ (the database) and K the language are likely
| much faster than R for general analysis type stuff. R is also
| free and Kdb+ is not.
| poulpy123 wrote:
| I'm somewhat convinced that there is a middle ground between
| corporate java and languages like K
| tempodox wrote:
| IDK, I'd rather have a language that compiles to native code,
| isn't quite as write-only as that, and doesn't cost an arm and a
| leg, even when using a DB.
| sl0thentr0py wrote:
| i've been doing the last 3 years advent of codes in q/kdb+, it's
| a lot of fun
| https://github.com/sl0thentr0py/aoc/blob/main/aoc2023/3/foo....
| James_K wrote:
| > Strings are just vectors of characters
|
| I hope not.
| lytedev wrote:
| Can you elaborate? Why not?
| James_K wrote:
| A character could be 1 byte long, in which case the language
| cannot properly handle unicode; it could be 4 bytes long in
| which case there is a lot of wasted space storing text and it
| cannot properly handle extended grapheme clusters; or a
| character could be arbitrary length at which point strings no
| longer have a flat representation in memory. None of these
| are good. The exact properties of a string can really only be
| encoded efficiently with a flat linear access data-type.
| dzaima wrote:
| 1-byte characters (i.e. what k's typically have) handle
| ASCII just fine, for which doing
| reversing/splitting/uppercase/lowercase/iteration/etc is
| actually meaningful (stock symbols, stringified dates,
| identifiers, etc).
|
| And if you have to handle arbitrary language user input,
| there's basically no operations you can/should actually do
| anyway. Uppercasing/lowercasing? Doesn't make sense on CJK
| languages. Reversing? Completely meaningless. Trimming to
| the first N chars for some visual display/summary/preview?
| Even grapheme clusters won't help avoiding a character with
| ten thousand combining components, and you'll have to do
| language-specific logic to not cut in the middle of a word
| for languages where the display of a prefix of a word may
| change depending on later letters! And forget about spaces
| meaning anything.
|
| Basically the only string ops I can think of that make
| sense for non-ASCII generally would be splitting/joining on
| newlines and escaping for JSON/HTML or whatever, which'll
| work completely fine on a byte list anyway.
|
| There's perhaps some middle-ground of doing things for a
| specific set of languages, but even for such you won't care
| about the storage format anyways, as what matters for you
| is just whether operations you use (presumably using some
| library; and even if you write a manual uppercase for
| French specifically or whatever, you'd notice if you
| implemented it wrongly) do the thing they should.
|
| So a list of byte chars is just fine for anything one would
| actually do, providing optimal access to ASCII, and not
| actually making things worse for non-ASCII.
| James_K wrote:
| Not true at all! Extended grapheme clusters are defined
| by Unicode for a reason and include relevant combining
| marks following a letter[1]. The point more generally is
| that a programming language shouldn't preferentially
| choose one character definition over another. The
| decision of whether to iterate by bytes, points, or
| clusters is a significant one which the language
| shouldn't force upon users. For many common operations,
| bytes are a sufficient representation, but then one must
| be precise about encoding. A list of UTF-8 bytes is very
| easy to deal with but the bytes of a UTF-16 string are
| highly problematic. Inserting a single byte character at
| the start of such a string would destroy its entire
| content. There is no situation where "give me the
| characters of this string" is a sufficiently precise
| statement, so it should not be made available by
| programming languages. Likewise, the idea of indexing a
| string is not well defined at all. The only consistent
| interface for accessing strings requires users to specify
| both encoding and separation, and this can only be done
| performantly in the general case with a linear scan.
|
| [1] http://unicode.org/reports/tr29/
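|
| The UTF-8 versus UTF-16 point above can be checked with a small
| Python experiment. This is only an illustration of the byte-level
| effect; prepend_byte is a hypothetical helper, and
| errors="replace" is used so the damaged input decodes instead of
| raising:

```python
text = "héllo"
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")


def prepend_byte(raw: bytes, encoding: str) -> str:
    """Insert one ASCII byte at the front of an encoded string and decode."""
    return (b"X" + raw).decode(encoding, errors="replace")


utf8_result = prepend_byte(utf8, "utf-8")
utf16_result = prepend_byte(utf16, "utf-16-le")

print(repr(utf8_result))   # 'Xhéllo': UTF-8 is unaffected
print(repr(utf16_result))  # every 16-bit code unit is now misaligned
```

| In UTF-8 the extra byte is simply one more character. In UTF-16
| it shifts every following code unit by one byte, so none of the
| original characters survive.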
| dzaima wrote:
| I meant the combining mark point as a thing you _would_
| want to cut off; a 50-char chopped-off "summary" of a
| thing _should not_ include a character with ten thousand
| combining marks ever. Of course it'd be preferred to cut
| before and not in the middle, but certainly not
| after, which is what you'd get if taking the first 50
| extended grapheme clusters, the 20000-byte glyph counting
| as one. Point being, you still just want to use a library
| that has properly thought out the question. And that
| applies to most (all?) sane fully-Unicode-aware
| operations.
|
| Places where ASCII-only is a known expectation and there
| are meaningful per-char operations are plenty; that's
| what using a list of bytes provides. Indeed you'd
| probably want to use another abstraction if you have non-
| ASCII. And for such you could use something to do the
| form of iteration or operation you want just fine, even
| if the input/output is a list of byte-chars representing
| plain UTF-8.
| James_K wrote:
| Well in that case, the way you get a 50 char summary is
| by iterating grapheme clusters, then counting up to 50
| points and discarding the broken cluster. It's quite
| trivial if the language exposes an interface for
| iterating both clusters and points, and without such an
| interface the problem is much harder to notice. Hence why
| the language shouldn't prefer clusters to points or
| points to clusters. It should expose all relevant
| representations without prejudice.
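|
| A rough sketch of that summary procedure in Python. The cluster
| splitter here is a deliberate simplification (a base character
| plus any following characters with a nonzero canonical combining
| class), not the full UAX #29 extended grapheme cluster algorithm;
| approx_clusters and summarize are hypothetical names:

```python
import unicodedata


def approx_clusters(text: str):
    """Yield a base character together with the combining marks after it.
    Approximation only: real grapheme segmentation follows UAX #29."""
    cluster = ""
    for ch in text:
        if cluster and unicodedata.combining(ch) == 0:
            yield cluster      # a new base character closes the old cluster
            cluster = ch
        else:
            cluster += ch      # first character, or a combining mark
    if cluster:
        yield cluster


def summarize(text: str, max_points: int = 50) -> str:
    """Keep whole clusters while the total stays within max_points code
    points; stop at the first cluster that would exceed the budget."""
    kept = []
    used = 0
    for cluster in approx_clusters(text):
        if used + len(cluster) > max_points:
            break
        kept.append(cluster)
        used += len(cluster)
    return "".join(kept)


# 'e' carrying 100 combining acute accents blows the 10-point budget,
# so the summary stops before it instead of cutting it in half.
noisy = "ab" + "e" + "\u0301" * 100 + "cd"
print(summarize(noisy, 10))  # ab
```

| The budget is counted in code points while the cut always falls
| on a cluster boundary, so an oversized cluster is dropped whole.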
|
| Even if ASCII is appropriate in some situation, this
| should be stated within the program. Requiring people to
| be explicit about the data they produce and consume is
| important and useful. A user might decide that UTF-16
| best serves their need (or be working on the Windows
| platform) in which case code which works with strings as
| linear sequences will be able to operate on their strings
| without issue. Code which assumes a UTF-8 byte
| representation will require the entire string to be
| allocated, converted, then reallocated and converted
| back. Huge overhead and potential incompatibility for no
| reason.
| dzaima wrote:
| > It's quite trivial if the language exposes an interface
| for iterating both clusters and points, and without such
| an interface the problem is much harder to notice
|
| I assure you, 99% of people won't handle this correctly
| even if given a cluster-based interface (if they even
| bother using it). And this still doesn't handle the
| question of cutting words in the middle of some languages
| resulting in broken display of the non-cut part (or
| languages without space-based word boundaries to cut on).
| So the preferred thing is still to use a library.
|
| I don't think anyone in k would use UTF-16 via a
| character list of 2 chars per code unit; an integer list
| would work much nicer for that (and most k interpreters
| should be capable of storing such with 16-bit ints;
| there's still some preference for using UTF-8 char lists,
| namely, such get pretty-printed as strings); and you'd
| have to convert on _some_ I/O probably anyway. Never
| mind the world being basically all-in on UTF-8.
|
| Even if you have a string type that's capable of being
| backed by either UTF-8 or UTF-16, you'll still need
| conversions between those at some points; you'd want the
| Windows API calls to have a
| "str.asNullTerminatedUTF16Bytes()" or whatnot (lest a
| UTF-8-encoded string makes its way here), which you can
| trivially have an equivalent of for a byte list. And I
| highly doubt that overhead of conversion would matter
| anywhere you need a UTF-16-only Windows API.
|
| I doubt all of those fancy operations you'll be doing
| will have optimized impls for all formats internally
| either, so there's internal conversions too. If anything,
| I'd imagine that having a unified internal representation
| would end up better, forcing the user to push the
| conversions to the I/O boundaries and allowing focus on
| optimizing for a single type, instead of going back-and-
| forth internally or wasting time on multiple impls.
| mlochbaum wrote:
| I think it's worth considering that application
| development and GUIs really aren't K's thing. For those,
| yes, you want to be pretty careful about the concept of a
| "character", but (as I understand it) in K you're more
| interested in analyzing numerical data, and string
| handling is just for slapping together a display or
| parsing user commands or something. So a method that lets
| the programmer use array operations that they already
| have to know instead of learning three different non-
| array ways to work with strings is practical. Remember
| potential users are already very concerned about the
| difficulty of learning the language!
| fc417fc802 wrote:
| Python uses UTF-8. A Python string is iterable. It is
| generally reasonable to describe any iterable as a vector
| (at least in terms of the API). The result of such
| iteration might not be a character in any formal sense, but
| it's a reasonable description nonetheless.
|
| I'm really not seeing the issue here.
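|
| A small illustration of the "not a character in any formal
| sense" caveat, as a sketch: iterating a Python str yields code
| points, and iterating its UTF-8 encoding yields byte values, so
| neither count has to match what a reader sees as one character.
| The sample string is an arbitrary choice for illustration.

```python
# 'e' followed by U+0301 COMBINING ACUTE ACCENT; displayed as a single glyph.
s = "e\u0301"

# Iterating a str yields one item per code point.
code_points = list(s)

# Iterating the encoded bytes yields one int per UTF-8 code unit.
utf8_units = list(s.encode("utf-8"))

print(len(code_points))  # 2
print(len(utf8_units))   # 3
```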
| khazhoux wrote:
| Developers act like they forgot about K
| pjmlp wrote:
| What seems to be a pity about most array languages is that in
| theory, they would be ideal DSL languages for SIMD and MIMD code
| exploration, but as far as I understand from ArrayCast guests,
| most are still interpreters at heart focusing on plain CPU
| execution.
| dzaima wrote:
| The big problem with using array languages for lower-level SIMD
| stuff is that that generally requires some amount of typedness,
| but tacking on types on an array language without ending up
| with having types be the majority of the syntax and code (or
| taking up a ton of mental capacity if utilizing very heavy type
| inference) would be rather non-trivial. And the operations you
| want for lower-level ops are quite different from the higher-
| level general-purpose ones too. (and, of course, some
| interpreters do make good use of SIMD and/or multithreading)
|
| That said, some form of array language more suited for stuff
| like that is a somewhat common question; maybe one day someone
| will figure it out.
|
| Vanessa McHale is doing some interesting work on a typed
| compilable array language, Apple[0].
|
| [0]: https://github.com/vmchale/apple/?tab=readme-ov-
| file#apple-a...
| Pompidou wrote:
| Maybe Co-dfns for APL will solve this? That's what I
| understood, but maybe I'm wrong.
| airstrike wrote:
| Side note, but some people have such an evident talent for
| writing that it makes reading about _any_ topic a worthwhile
| experience. This author, Stefan Kruger, seems to be one of them.
|
| I almost wish this link was to a blog rather than to a book about
| K, for which I only have a perennial curiosity.
|
| Here's to hoping they consider writing said blog. I notice they
| have one but it only has 3 posts, all of which are about past
| Advent of Code puzzles.
| bear8642 wrote:
| The about section has links to other things he's
| written/presented.
| IshKebab wrote:
| > The same baseless accusations of "unreadable", "write-only" and
| "impossible to learn" are leveled at all Iversonian languages, k
| included.
|
| I'd be really curious to know if they really are baseless. It's
| very very difficult to imagine that K developers can _really_
| read a mess like this as easily as one might read Go or whatever.
|
| https://github.com/KxSystems/kdb/blob/master/e/json.k
|
| Has anyone tested this? Take a K program and ask a K developer to
| explain it? Or maybe introduce a deliberate bug and see how long
| they take to fix it compared to other languages. You could
| normalise the results based on how long it takes them to write
| some other code.
|
| Free research project for any compsci researchers out there...
| (though good luck finding skilled K programmers).
| geocar wrote:
| > It's very very difficult to imagine that K developers can
| really read a mess like this as easily as one might read Go or
| whatever.
|
| Shui luo shi chu (when the water recedes, the rocks appear).
|
| > Has anyone tested this? Take a K program and ask a K
| developer to explain it?
|
| I am not sure what you're asking. Do you want me to read it to
| you?
|
| Here is me reading some other people's code:
|
| https://news.ycombinator.com/item?id=8476633
|
| https://news.ycombinator.com/item?id=22010223
|
| Do you want me to read to you the JSON encoder (written twice)
| and the decoder in this way?
|
| > Or maybe introduce a deliberate bug and see how long they
| take to fix it compared to other languages.
|
| https://news.ycombinator.com/item?id=27209093#27223086
|
| > You could normalise the results based on how long it takes
| them to write some other code.
|
| https://news.ycombinator.com/item?id=22459661#22467866
|
| https://news.ycombinator.com/item?id=31361023#31364262
| michaelg7x wrote:
| It's entirely possible, have done it a few times. For example,
| the `fby` verb[?] annoyed me one too many times, so I pulled it
| apart to see what was going on. In contrast to json.k it's
| quite short. I usually split each separable idea into a new
| line and introduce a bunch of new variables to track state that
| would otherwise be passed from right to left. Lengthy end-of-
| line comments are my chosen way of understanding q or k when I
| come back to anything later.
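|
| The "one idea per line, with named intermediates" approach can
| be sketched in Python. This is an assumption-laden analogue, not
| the k source of `fby`: it assumes `fby` applies an aggregate to
| each group of values and spreads the result back over the
| original positions. The parameter names are hypothetical.

```python
from collections import defaultdict


def fby(aggregate, values, groups):
    """Hypothetical analogue of q's fby: per-group aggregate, spread to every row."""
    if len(values) != len(groups):
        raise ValueError("length")  # values and groups must line up item by item

    # Step 1: record which positions belong to each distinct group key.
    positions = defaultdict(list)
    for i, key in enumerate(groups):
        positions[key].append(i)

    # Step 2: apply the aggregate to each group's values.
    per_group = {
        key: aggregate([values[i] for i in idx])
        for key, idx in positions.items()
    }

    # Step 3: spread each group's result back to the original positions.
    return [per_group[key] for key in groups]


print(fby(max, [1, 5, 3, 2], ["a", "b", "a", "b"]))  # [3, 5, 3, 5]
```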
| russellbeattie wrote:
| > _"there is no single definitive k, but instead a sequence of
| slightly incompatible versions. If you decide to stick with k,
| you'll see mentions of k4, k5 etc."_
|
| I don't know about the qualities of k itself, but I think the
| idea of having a common practice for experimental programming
| languages to be grouped under a single name like "E" with a
| number is quite attractive.
|
| There are lots of students, hobbyists, researchers, professional
| devs and companies who are developing their own working
| programming language. There are a million of them, all with their
| own names. 99.9% of them are ignored, or criticized unfairly by
| others expecting fully fleshed out features.
|
| I can imagine a GitHub repo where you can register a new language
| "En" (with n being a number) rather than it living in obscurity
| on a random website. Then others can jump in and experiment with
| the language and give it feedback, fork it, etc.
|
| This isn't just for toy languages, but for big organizations like
| Google. Instead of naming a not-fully-baked C++ successor as
| "Carbon" and getting flak for it not being ready for real world
| code yet, they could simply call it "E321" and the status of the
| language would be self-explanatory.
|
| Then if one of the E languages gains enough traction, it could
| "graduate" to its own named language.
|
| I also like the cred that an "official" E language could get when
| a dev talks about it to others. Everyone would immediately know
| it was experimental and where to see the code.
| sedatk wrote:
| "k" was used in lowercase throughout the article, including the
| title.
| nottorp wrote:
| > As you've landed here, you've clearly somehow sought out k, and
| you likely have an idea what it's about.
|
| Author didn't expect to end up on HN then :)
| shric wrote:
| My first thought was "weird, they made a language called k even
| though there is already a language called K". I then realized
| it's actually talking about K.
|
| Throughout the article it's spelled k consistently except at the
| start of a sentence. This is weird. The language is K not k.
| Nobody spells the C language as c.
| cess11 wrote:
| It's actually rather common to spell it as k, as well as K. I
| think q is more common than Q.
___________________________________________________________________
(page generated 2025-02-10 23:00 UTC)