https://xpqz.github.io/kbook/Introduction.html

What about k?

Contents

  * Why k?
  * Open source k
  * Nomenclature and style
  * About this book

Introduction

k is a family of concise, fast vector-oriented languages designed by
Arthur Whitney. Calling k a "family" is deliberate; there is no
single definitive k, but instead a sequence of slightly incompatible
versions. If you decide to stick with k, you'll see mentions of k4,
k5 etc. Exactly what those numbers mean isn't important: they're
"generations", rather than versions, and not every generation was
ever made available to the public, or even completed. A Python 2/3
split every time, if you like.

Reputedly, Arthur always starts from scratch when making the next
generation of k, happily and deliberately sacrificing backward
compatibility in order to build something better and faster, cut fat
or revert design decisions that didn't pan out. The bleeding-edge k
is being developed right now (at the time of writing) by Arthur's
latest venture, Shakti. Shakti-k is dubbed k9 by some. If this all
sounds a bit anarchic, that's because it is. There is no expectation
that code written for one generation will always work unchanged in
the next. Embrace it. Evolution is healthy.

The main commercial version of k available today is q/kdb+ from kX
systems, Arthur's previous venture. It's stable, fast, "batteries
included" and really the benchmark k against which others are
measured. kX's k version is usually thought of as k4. However, kX's
main product is the language q, a k-derivative (implemented in k) that
looks (a bit) more like a traditional programming language, and the
freakishly fast, distributed columnar store kdb+. kX views k as
"exposed infrastructure" and actively discourages its users from
using it. kX provides exceptionally good documentation for q, which is
always worth reading: even if q isn't k, it's often close enough.
k9 currently has Arthur-style documentation only: the ref-card.

A commercial license for q/kdb+ is eye-wateringly expensive and
likely out of reach for hobbyists. You can, however, run it for free
under a non-commercial evaluation license, but note that the product
is "tethered", and so sends telemetry data back to base and cannot be
used without an active internet connection.

You can try k9 under a free non-commercial evaluation license, too,
but k9 is very much a moving, changing target at the time of writing.

Why k?

I'd like to avoid the advocacy piece, as there seems to be little
middle ground. As you've landed here, you've clearly somehow sought
out k, and you likely have an idea what it's about. K, like its
Iversonian siblings APL and J, values conciseness, or perhaps we
should say terseness, of representation and speed of execution.

The same baseless accusations of "unreadable", "write-only" and
"impossible to learn" are leveled at all Iversonian languages, k
included. I covered that a bit in the introduction to my book on APL,
and so won't repeat that here. The k cognoscenti, when told the
language is unreadable (again), will simply show you the whites of
their eyes, mumble "whatever", and get on with their lives.
Readability is a property of the reader, not the language.

K is a general-purpose programming language that excels as a tool for
data wrangling, analytics and transformation. The analytics use case
really drove both its inception and adoption in the financial
industry. Compared with APL, it's more consistent (partly a
consequence of Arthur's propensity for always starting from scratch),
perhaps a smidge less mathematically pure (instead choosing to
optimise for speed and pragmatism), comes with fewer "batteries
included", and has a vector-oriented, rather than array-oriented, model.

If you come to k knowing APL or J, the transition is pretty
pain-free. If k is your first foray into vector and array languages,
coming perhaps from Python or JavaScript, the learning ramp will feel
steeper than you're used to, but at least the vector model will feel
more familiar than APL's rank. In k, like in Python, depth and rank
are the same thing.
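
For readers arriving from Python, the nested-list picture can be sketched as follows. This is a Python analogy rather than k code: depth and flip are hypothetical helpers written for this illustration, and the sample data mirrors two entries of the ngn/k ref card reproduced later in this chapter (the unit matrix and flip).

```python
def depth(x):
    """Nesting depth: 0 for an atom, 1 for a flat list, 2 for a list of lists."""
    if not isinstance(x, list):
        return 0
    # An empty list still counts as one level of nesting.
    return 1 + max((depth(item) for item in x), default=0)


def flip(rows):
    """Transpose a rectangular list of lists (rows become columns)."""
    return [list(column) for column in zip(*rows)]


atom = 12
vector = [1, 2, 3]
unit_matrix = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # a "matrix" is just a list of lists
ragged = [[1, 2], [3]]                           # inner lists may differ in length

print(depth(atom), depth(vector), depth(unit_matrix), depth(ragged))  # 0 1 2 2
print(flip([["a", "b"], ["c", "d"]]))  # [['a', 'c'], ['b', 'd']]
```

There is no separate "shape" attached to the data here: how deeply the lists nest is all there is to know, which is the sense in which depth and rank coincide.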

K is a teeny, tiny language. It has no libraries to speak of. You can
pick up the basics in an afternoon. In terms of complexity, it's
about as hard as learning regular expressions. However, as with
most things, it takes tons of practice to become good at it. In the
hands of a master practitioner, it's truly a sight to behold.

If you persist, you'll find you have a computational superpower at
your fingertips. You'll learn new ways to think about data. Going
back to a mainstream language will feel dull and boring.

I think of k as APL's "more punk rock little sister". It's not for
everyone.

Open source k

There is a small, but thriving, open source k community, and there are
probably half a dozen or more open source implementations of k
knocking around, of varying degrees of ambition. The k language
deserves a future outside the commercial implementations marketed to
the financial industry, and this book will deal solely with open
source k. If you're a hedge fund jock who sees k as a way to get
ahead, you're welcome here, too, but note that we'll not touch on
q, kdb+ or the Shakti equivalents.

This book is written as a jupyter-book, using the ngn/k kernel
developed specifically for this book. All examples are thus run under
ngn/k, which is a k6. You can run ngn/k directly in your browser,
too, and we'll link all examples to a live, web-based repl if you
want to experiment.

Installing ngn/k requires you to get your hands dirty and build from
source. At the moment, it's buildable on Linux, FreeBSD, OpenBSD and
macOS.

John Earnest's oK is another k5/6, available online. Most examples
here should run unchanged under oK, and if that's not the case, we'll
try to point that out. oK has a well-written manual in addition to
the traditional Arthur-style ref-card, and John has also written a more
general intro to programming in k.

There is also kona, a k3, and ktye/k, which is the only open source k
to support some of the ksql database extensions.

Nomenclature and style

K uses nomenclature introduced in the J language (another APL
derivative) with which Arthur was deeply involved. You may have come
across Arthur's sketched J interpreter fragment, known as the J
Incunabulum, and been either awestruck or repelled that one can do
such unspeakable things in C. In J, and also in k, we borrow terms
from linguistics, rather than from mathematics, to describe the
building blocks of the language. J/k folk insist that this makes it
easier to learn and understand, but if you're already versed in other
programming languages, this is sure to grate a bit in the beginning.
What the hell do you mean "adverb"?

In k, we have nouns, verbs and adverbs, rather than data, functions
and operators. We'll stick to these conventions. K, like APL and J,
also uses the words "monadic" and "dyadic" to refer to verbs or
adverbs taking one or two arguments respectively. "Monadic" has
nothing to do with Haskell monads, you'll be pleased to hear.
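
The monadic/dyadic distinction can be sketched in Python. The minus function below is a hypothetical stand-in for k's - verb, not a description of how any k is implemented: called with one argument it negates, called with two it subtracts, and in both cases it works item by item on lists. The two calls at the end reproduce the "- 1 2" and "1-2 3" entries from the ngn/k ref card reproduced later in this chapter.

```python
def pairwise(f, x, y):
    """Apply f item by item; an atom on either side is paired with every item."""
    if isinstance(x, list) and isinstance(y, list):
        if len(x) != len(y):
            raise ValueError("length mismatch")
        return [pairwise(f, a, b) for a, b in zip(x, y)]
    if isinstance(x, list):
        return [pairwise(f, a, y) for a in x]
    if isinstance(y, list):
        return [pairwise(f, x, b) for b in y]
    return f(x, y)


_MISSING = object()  # sentinel: distinguishes "one argument" from "two"


def minus(x, y=_MISSING):
    """One symbol, two meanings, chosen by the number of arguments."""
    if y is _MISSING:
        # Monadic use: negate x (computed as 0 - x, item by item).
        return pairwise(lambda a, b: a - b, 0, x)
    # Dyadic use: subtract y from x.
    return pairwise(lambda a, b: a - b, x, y)


print(minus([1, 2]))     # monadic: [-1, -2]
print(minus(1, [2, 3]))  # dyadic:  [-1, -2]
```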

Traditions in k dictate that code should be as terse as possible. The
use of unnecessary whitespace and non-single-letter names for things
is seen as a sign of inexperience. However, for the purposes of this
book, we'll make no apologies for breaking such style guidelines
where we think this aids clarity.

About this book

I can't claim to be an expert on k. As with my APL book, this is a
jazzed-up version of the notes I took when learning. There is a lack
of accessible introductory texts on k for experienced practitioners
of other languages, and this is my contribution to help plug that
gap. Moreover, k deserves to break free of its association with the
financial industry, and perhaps this can help with that, too.

Ks have traditionally been 'documented' via a so-called ref-card.
This is the briefest possible listing of built-ins, usually with zero
context or explanation. This is part of the sometimes unhelpful
mythology surrounding k, and - some might argue - a semi-deliberate
barrier: you're expected to get yourself to a point where you also
believe that the ref-card is sufficient documentation, and so
complete the circle. ngn/k, the dialect we're chiefly concerned with
here, carries on with the ref-card tradition, but its version is
actually unusually comprehensive. You can view its ref card(s) with a
few backslash commands in the repl, starting with a single \ for the
index page:

\

\   help               \\         exit
\a  license(AGPLv3)    \l file.k  load
\0  types              \t:n expr  time(elapsed milliseconds after n runs)
\+  verbs              \v         variables
\:  I/O verbs          \f         functions
\'  adverbs            \cd path   change directory
\`  symbols            \other     command(through /bin/sh)
\h  summary

By no means should you expect to be able to actually learn k from the
ref cards, but - and I do feel slightly uncomfortable saying it -
you'll soon find them indispensable. For example, \+ lists the
built-in verbs with both monadic and dyadic forms in
one compact screen:

\+

Verbs:    : + - * % ! & | < > = ~ , ^ # _ $ ? @ . 0: 1:
notation: [c]har [i]nt [n]umber(int|float) [s]ymbol [a]tom [d]ict
          [f]unc(monad) [F]unc(dyad) [xyz]any
special:  var:y     set    a:1;a -> 1
          (v;..):y  unpack (b;(c;d)):(2 3;4 5);c -> 4
          :x        return {:x+1;2}[3] -> 4
          $[x;y;..] cond   $[0;`a;"\0";`b;`;`c;();`d;`e] -> `e
          o[..]     recur  {$[x<2;x;+/o'x-1 2]}9 -> 34
          [..]      progn  [0;1;2;3] -> 3

::  self      ::12 -> 12
 :  right     1 :2 -> 2   "abc":'"d" -> "ddd"
 +x flip      +("ab";"cd") -> ("ac";"bd")
N+N add       1 2+3 -> 4 5
 -N negate    - 1 2 -> -1 -2
N-N subtract  1-2 3 -> -1 -2
 *x first     *`a`b -> `a   *(0 1;"cd") -> 0 1
N*N multiply  1 2*3 4 -> 3 8
 %N sqrt      %25 -> 5.0   %-1 -> 0n
N%N divide    4 3%2 -> 2 1   4 3%2.0 -> 2.0 1.5
 !i enum      !3 -> 0 1 2   !-3 -> -3 -2 -1
 !I odometer  !2 3 -> (0 0 0 1 1 1;0 1 2 0 1 2)
 !d keys      !`a`b!0 1 -> `a`b
 !S ns keys   a.b.c:1;a.b.d:2;!`a`b -> ``c`d
x!y dict      `a`b!1 2 -> `a`b!1 2
i!I div       -10!1234 567 -> 123 56
i!I mod       10!1234 567 -> 4 7
 &I where     &3 -> 0 0 0   &1 0 1 4 2 -> 0 2 3 3 3 3 4 4
 &x deepwhere &(0 1 0;1 0 0;1 1 1) -> (0 1 2 2 2;1 0 0 1 2)
N&N min/and   2&-1 3 -> -1 2   0 0 1 1&0 1 0 1 -> 0 0 0 1
 |x reverse   |"abc" -> "cba"   |12 -> 12
N|N max/or    2|-1 3 -> 2 3   0 0 1 1|0 1 0 1 -> 0 1 1 1
 <X ascend    <"abacus" -> 0 2 1 3 5 4
 >X descend   >"abacus" -> 4 5 3 1 0 2
 <s open      fd:<`"/path/to/file.txt"
 >i close     >fd
N<N less      0 2<1 -> 1 0
N>N more      0 1>0 2 -> 0 0
 =X group     ="abracadabra" -> "abrcd"!(0 3 5 7 10;1 8;2 9;,4;,6)
 =i unitmat   =3 -> (1 0 0;0 1 0;0 0 1)
N=N equal     0 1 2=0 1 3 -> 1 1 0
 ~x not       ~(0 2;``a;"a \0";::;{}) -> (1 0;1 0;0 0 1;1;0)
x~y match     2 3~2 3 -> 1   "4"~4 -> 0   0~0.0 -> 0
 ,x enlist    ,0 -> ,0   ,0 1 -> ,0 1   ,`a!1 -> +(,`a)!,,1
x,y concat    0,1 2 -> 0 1 2  "a",1 -> ("a";1)
 ^x null      ^(" a";0 1 0N;``a;0.0 0n) -> (1 0;0 0 1;1 0;0 1)
a^y fill      1^0 0N 2 3 0N -> 0 1 2 3 1   "b"^" " -> "b"
X^y without   "abracadabra"^"bc" -> "araadara"
 #x length    #"abc" -> 3   #4 -> 1   #`a`b`c!0 1 0 -> 3
i#y reshape   3#2 -> 2 2 2
I#y reshape   2 3#` -> (```;```)
f#y replicate (3>#:')#(0;2 1 3;5 4) -> (0;5 4)   {2}#"ab" -> "aabb"
x#d take      `c`d`f#`a`b`c`d!1 2 3 4 -> `c`d`f!3 4 0N
 _n floor     _12.34 -12.34 -> 12 -13
 _c lowercase _"Ab" -> "ab"
i_Y drop      2_"abcde" -> "cde"   `b_`a`b`c!0 1 2 -> `a`c!0 2
I_Y cut       2 4 4_"abcde" -> ("cd";"";,"e")
f_Y weed out  (3>#:')_(0;2 1 3;5 4) -> ,2 1 3
X_i delete    "abcde"_2 -> "abde"
 $x string    $(12;"ab";`cd;+) -> ("12";(,"a";,"b");"cd";,"+")
i$C pad       5$"abc" -> "abc  "   -3$"a" -> "  a"
s$y cast      `c$97 -> "a"   `i$-1.2 -> -1   `$"a" -> `a
s$y int       `I$"-12" -> -12
 ?x uniq      ?"abacus" -> "abcus"
X?y find      "abcde"?"bfe" -> 1 0N 4
i?x roll      3?1000 -> 11 398 293   1?0 -> ,-8164324247243690787
i?x deal      -3?1000 -> 11 398 293 /guaranteed distinct
 @x type      @1 -> `b   @"ab" -> `C   @() -> `A   @(@) -> `v
x@y apply(1)  {x+1}@2 -> 3   "abc"@1 -> "b"   (`a`b!0 1)@`b -> 1
 .S get       a:1;.`a -> 1   b.c:2;.`b`c -> 2
 .C eval      ."1+2" -> 3
 .d values    .`a`b!0 1 -> 0 1
x.y apply(n)  {x*y+1}. 2 3 -> 8   (`a`b`c;`d`e`f). 1 0 -> `d

@[x;y;f]   amend  @["ABC";1;_:] -> "AbC"   @[2 3;1;{-x}] -> 2 -3
@[x;y;F;z] amend  @["abc";1;:;"x"] -> "axc"   @[2 3;0;+;4] -> 6 3
.[x;y;f]   drill  .[("AB";"CD");1 0;_:] -> ("AB";"cD")
.[x;y;F;z] drill  .[("ab";"cd");1 0;:;"x"] -> ("ab";"xd")
.[f;y;f]   try    .[+;1 2;"E:",] -> 3   .[+;1,`2;"E:",] -> "E:typ"
?[x;y;z]   splice ?["abcd";1 3;"xyz"] -> "axyzd"
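
To help decode the card's notation, here is a rough Python rendering of two of its entries, "&I where" and "=X group". The function names and the use of a Python dict are choices made for this sketch and are not part of ngn/k; only the inputs and results are taken from the card above.

```python
def where(counts):
    """Like &I on the card: repeat each index i as many times as counts[i]."""
    return [i for i, n in enumerate(counts) for _ in range(n)]


def group(items):
    """Like =X on the card: map each distinct item to the indices where it occurs."""
    positions = {}
    for i, item in enumerate(items):
        positions.setdefault(item, []).append(i)
    return positions  # dicts keep insertion order, so keys appear in first-seen order


print(where([1, 0, 1, 4, 2]))
# [0, 2, 3, 3, 3, 3, 4, 4]
print(group("abracadabra"))
# {'a': [0, 3, 5, 7, 10], 'b': [1, 8], 'r': [2, 9], 'c': [4], 'd': [6]}
```

Each card line packs the same information into a single row: the verb with its argument types, a one-word name, and one or more examples written as input -> result.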

Are you ready? Let's crack open the kool-aid.

 


By Stefan Kruger, Computational Array and Magic
(c) Copyright 2021.

Creative Commons License

This work is licensed under a Creative Commons
Attribution-NonCommercial-ShareAlike 4.0 International License.