types
type | num | what | log2 | bits/element |
| 0 | list | 3 | 64 |
b | 1 | bool | 0 | 8 |
c | 2 | char | 0 | 8 |
i | 3 | int | 2 | 32 |
e | 4 | float | 2 | 32 |
g | 5 | geo(complex) | 3 | 64 |
L[]={3,0,0,2,2,3} log2 sizes are stored in L
memory layout
the k value is a 64-bit pointer type called
U
.
small atoms are stored in the k value itself:
#define t(t,x) ((U)(t)<<61|(x)) the type is stored in the high bits
the functions
tb,tc,ti,te
(types 1 to 4) create bool, char, int and float atoms.
for larger types the pointer in the k value points to the array part and the type is stored in the header 5 bytes before.
-32 -8 -4 0 array layout in memory
[padding][rrit][nnnn][array data] rr: 16bit refcout
↑pointer i: bucket type, t:k-type, nnnn:32bit element count
the type is accessed by the macros
#define tx (x>>61) /for tagged atoms
#define Tx sx[-5] /for values stored in memory
tx
returns 0 if the values are stored in memory and is also used to check if it's a small or large value.
bucket allocator
4 1k 14 1m 24 1g
5 2k 15 2m 25 2g
6 4k 16 4m 26 4g
7 8k 17 8m 27 8g
8 16k 18 16m 28 16g
9 32k 19 32m 29 32g
0 64 10 64k 20 64m
1 128 11 128k 21 128m
2 256 12 256k 22 256m bucket type
3 512 13 512k 23 512m and space(bytes)
memory is managed in blocks. there is only one allocation(mmap) at the start of the program of 32gb of virtual memory.
the memory is stored in the free list
M[29]
. all other buckets
M[0..]
are initially empty(zero value).
when a block of memory is requested, the required bucket type is calculated and the first free block of that type is returned.
if there is none, the next larger block is requested and split into two.
this is done by the function
M_
recursively. the function takes the bucket type as argument
x
and returns a pointer.
Ui(M_,Ux=M[i];x?(W+=1<<i,M[i]=xx,x):30>i?_M(i,M_(i+1))+(2*n0<<i):OO())
W
stores the total allocated memory.
a free block of memory contains a pointer to the next free block at the start.
splitting is done by freeing the next larger block as a smaller one (half size).
_M
releases memory, it has two arguments bucket type and pointer:
f
and
x
and returns the input pointer:
g(_M,W-=1<<f;xx=M[f];M[f]=x)
in fact when a larger block is split, the lower part is used and the upper part is not touched only referenced in the free list, such that the operating system can lazily allocate memory.
reference counting
large k values are refcounted. the refcount 0 when the value is used once and freed when a value with refcount 0 is no longer used.
r_
checks if the k value lives in memory and increases the refcount. it aborts if the refcount overflows 16bits.
f(r_,px||tx?x:M1&++rx?x:OO())
f(_r,P(px||tx,x)P(M1&rx,--rx;x)if(!Tx)n(_r(xU))_M(mx,x-n0);x)
_r
decreases the refcount.
if it was used only once, it frees memory.
if the value is compound (e.g. a general list) it recursively unrefs it's elements before freeing the block.
the macros
Tx rx mx
access type, refcount and bucket type from the array header.
parse
execute
verbs
print
ws print a c-string
wc print a char
wx print a k value
syscalls
vector instructions
b_ is bit from byte(_b byte from bit) pmovemask256 _mm256_movemask_epi8
b2 is bit from int(i.e.size2) pmovmskps256 _mm256_movemask_ps
X9 is ~\
i0&i2 are indexing(byte int)
a0 is at-size-0. i.e. indexing into byte array.
wasm
main
int main(int n,char**_){A=(U)$>N;M[29]=m_(0,n0<<30,3,A?0x1042:0x4022,0,0);M_(0);W=0;
if(*++_)ls(*_);cc[256];W(1)wc(B),c[_w(0,(U)c,256)-1]=0,*c?os(c):0;}
A
tests if the program runs on apple or linux, by checking the address of a function value.
mmap flags have different numbers on these systems.
then it allocates 32gb of virtual memory and requests a bucket of the smallest size, which causes allocator to split the single large block of available memory into smaller buckets.
if there are is a function argument, assume it's a k file and interprete it.
then start the repl, a while loop: write the prompt, read the next line of input into a 256 byte buffer and interprete it. output the result with
os(c)
.