halfkey: étude in k flat
halfkey is a language that sits halfways between k and the interpreter.
it extends k with custom builtins and writes a new source file:
e.g. k+.c
the purpose of halfkey is mixed scalar/vector code mutating arrays, e.g.
linear algebra, faster (or easier) than k.
ideally it obsoletes itself.
this page runs a 2 stage k.wasm:
stage 1 reads the interpreter source (k.k) intermediate form,
halfkey.k (the language compiler) and wb.k (IR to wasm binary compiler).
it then compiles halfkey's input and emits a new extended k.wasm.
js destroys the current instance and boots to:
stage 2 reads and executes k code that uses the extensions
clicking [run] does just that,
[ast ] shows the parse tree of the generated functions
[code] shows source for the functions for c/go/wat/wasm
[k+.c] downloads the generated source of the extended k interpreter.
everything is in k: halfkey.kk.k(large) wb.kcc.kdrawtree.k
halfkey is 7 times faster than khalfkey functions
name:{[xi;yI;z]...}
is compiled to a native function, e.g in c: K name(K X){..}
the signature is always monadic (argument list)
but from k it is called with fixed arity like a lambda:
name[3;4 5;(1 2;3 4)] /name[3;4 5] projects
the compiler injects unpacking code, type checking/conversion
and handles refcounting.
types
halfkey is statically typed and types are inferred.
within the function there are atoms (ints and floats),
which are true i32/f64, not tagged k-values.
and vectors which are k values but can be indexed as ints or floats,
e.g. for x declared as xI, x[1+y] returns an int.
these verbs are compiled to int & (most)float ops:
-~ neg not
+-*%&|<>=~^ add sub..
unspecified types remain k, and all k monadic and dyadic functions
are available (but no operators, compositions, projections, lambdas..):
+-*%&|<>=~!,^#_$?@.
they compile to function calls to the primitive implementations, e.g.
2#x is Tak(Ki(2),rx(x))
literals and mixed i+k or f+k are converted to k types.
mixed i+f are converted to float.
the above k expression also refcounts automatically: rx(x).
but assignments and indexing does not, we want to be able to
modify k objects directly. see the lu example.
flow
halfkey's syntax is k for ease of parsing (it uses `p"..") but flow control is different
(because k-cond $[x;y..] and while[..] generate complicated byte code with jumps).
instead we use arthuresque W[cond;a;b;c] function call syntax for while,
N[n;..] for a loop with known length and if[a;b] or if[a;b;c] as conditionals.
the latter is detected as expression, e.g. 1+if[a;b;c] if a and b are the same type.
also jump table as switch[i;a;b;c] where cases are consecutive 0,1..default.
infix-postfix-treetable
"x+3*y" /k string form, parse with `p"x+2*y" returns:
(`y;.;3;*;`x;.;+) /kvm byte code, postfix (actually list for k values)
+`t`p`i`s!(..) /treetable, prefix with parent vector
t p i s /type parent int symbol(payloads)
0 `add 0 2 `i
1 `get 0 0 `x
2 `mul 0 2 `i
3 `lit 2 3 `i
4 `get 2 0 `y
there are many possibilities to do the transformations,
e.g. replace x0 with x[0]. byte code is a good fit for find and replace.
treetables are hardest to write and manipulate, but unbeatable for query.
for type inference during compilation, we temporarily add a type column q resulting in tpisq.
prefix
the main translation from postfix to the treetable is the function:
prefix:{[tab;y]
pu:{tab::tab,,+tpisq!x} /push
po:{r:*|tab;tab::-1_tab;r} /pop
li:{n:-x;tab::(n_tab),,|n#tab} /make list (`27)
$[`v~yt:@y;$[(y)~`27;li@*po[]`i /kvm switch see exec.go:^func exec
0~n:64\0+y
pu monadic[y;po[]]
1~n;pu dyadic[y;po[];po[]]
2~n;err`indirect
3~n;pu amend[y;po[];po[];po[];po[]]
4~n;err`drop
5~n;err`jump
6~n;err`jumpifnot
pu quotedverb y]
yt~`s;pu symbol y;pu const[yt;y]]
tab}
it uses a stack (a list of partial k trees) and is applied as:
prefix/[initstack;expression]
at the end one table remains on the stack
basic verbs in byte code have numeric values and can be written
in a special form: e.g. (+;-;*)~(`2;`3;`4), and 2~0+(+)
the dyadic forms are specialized by the parser and add 64 to the
numeric values, e.g. the plus in 0+*|`p"1+2" is 66 not 2.
translate&maxout
catenate two trees is simple if they are tables: x,y
but we have to fix the parent vector
x,y[`p]:0|(#x)+y`p
both trees were written in discretion with their root's parent set to 0N.
it remains negative during translation and is maxed out with the new parent.
0| can be replaced by any other node number, wherever y should be linked into x
the above assumes indexed assign x[`p]:.. (so should modified assign) returns x
completely, not just the right hand side.
this pattern is used throughout halfkey.k
inline assembly
halfkey also patches the k table directly, e.g. this adds
function xtp (which does type checking)
asm:+tpis!(`fun`arg`sym`arg`sym`res`ast`cnd`neq`cal`get`get`ret`cal`lit`ret`get
0N 0 1 0 3 0 0 6 7 8 9 8 7 12 13 6 15
0 0N 0 0N 0 0N 0N 0N 2 0N 0N 0N 0N 0N 1 0N 0N
`xtp`k`x`i`t`k```i`tp`x`t`k`trap`i`k`x)
t,:asm[`p]:0|(#t)+asm`p
it's a painful exercise.
benchmark: lu decomposition 1000x1000
-download k+.c from ktye.github.io/halfkey#lu
-write lu.k
lu:{[A]i:0;j:!#A;P:!#A /k version
while[1<#j
k:i+*&a=m:|/a:abs A[j;i]
P[(i;k)]:P[(k;i)]
A[(i;k)]:A[(k;i)]
A[j:1_j;i]%:A[i;i]
A[j;j]-:A[j;i]*\:A[i;j]
i+:1]
(A;P)}
A:{x^?x*x} /an 5: 5x5 random floats
-compare:
$gcc -O3 k+.c -lm
$time ./a.out lu.k -e 'lu A 1000' /k
$time ./a.out lu.k -e 'LU A 1000' /halfkey
-same for go:
$go build k+.go
$time ./k+ lu.k -e 'lu A 1000'
$time ./k+ lu.k -e 'LU A 1000'
-similar for matmul:
-download k+.c k+.go from ktye.github.io/halfkey#mm
-write mm.k
mm:(+/*)\:
A:{(-x 1)^?*/x}
a:A 1000 2000
b:A 2000 1500
-compare with
$time ./a.out mm.k -e 'mm[a;b]' /k
$time ./a.out mm.k -e 'MM[a;b]' /halfkey
time(s):
LU decomposition matmul
k halfkey k halfkey
c 4.28 0.58 9.8 4.1
go 9.36 1.76 24.9 12.3
todo
- break continue and labels(outer)
e.g. W[`a;1<x;.. <a ..] /continue
and W[`a;1<x;.. >a ..] /break
labels default to W (N in ndo) for inner loop
- allow more types, e.g. chars, symbols
- allow some reductions in for k types, e.g. +/ &/
- avoid some predeclarations
apl evaluation order during compilation sometimes prevents
the compiler from knowing the type of a previously assign
variable. they need to be predeclared as a work-around.
- if[c;block[a;b]] is ugly, if[c;[a;b]] would be better but is hard to parse from byte code