# Collapse OS usage guide

If you already know Forth, start here. Otherwise, read
doc/primer first.

We begin with a few oddities in Collapse OS compared to
traditional Forths, then cover higher level operations.

# Comments

Both () and \ comments are supported. The word "(" begins a
comment, which ends when a ")" word is read. It needs to be a
word, that is, surrounded by whitespace. "\" comments out the
rest of the line.

# Cell size and memory map

Cell size is hardcoded to 16-bit. Endianness is arch-dependent
and core words that deal with 16-bit values read and write them
according to native endianness.

Memory is divided into 4 main zones:

1. Boot binary: the binary that has to be present in memory at
   boot time. When it is, jump to the first address of this
   binary to boot Collapse OS. This code is designed to be able
   to run from ROM: nothing is ever written there.
2. Work RAM: As much space as possible is given to this zone.
   This is where HERE begins.
3. SYSVARS: Hardcoded memory offsets where the core system
   stores its things. It's $80 bytes in size. If drivers need
   more memory, it's bigger. See doc/impl for details.
4. PS+RS: Typically around $100 bytes in size. Their
   implementation is entirely arch-specific. Overflows aren't
   checked; PS underflows are checked through SCNT.

Unless there are arch-related constraints, these zones are
placed in that order (boot binary at addr 0, PSP at $ffff).

# Number Literals

Whenever a word is parsed in the interpreter loop, we first try
parsing the word as a number literal. There are 3 literal
types:

1. A 100% digits number is parsed as a decimal (12345).
2. A string starting with $ is parsed as hexadecimal ($ab12).
3. A character inside quotes is parsed as that character ('A').

# Strings and lines

Strings in Collapse OS are an array of characters in memory
associated with a length. There is no termination character.
This length, when referring to that string in the different
string handling words, is usually passed around as a separate
argument in PS. It is common to see "sa sl", "sa" being the
string's address, "sl" being its length.

How that "sl" is encoded depends on the situation. For example,
the LIT" word, which writes the enclosed string and, at runtime,
yields "sa sl", wraps the string in a branch word (so that the
string isn't evaluated by Forth) and follows it with 2 number
literals.

When we refer to a "line", it's a string that is of size LNSZ,
a constant that is always 64. It corresponds to the size of the
input buffer and to the size of a line in a Block (16 lines per
block).

Because those lines have a fixed length, we sometimes want to
know the length of the actual content in it (for example, to
EMIT it). When we do so, for example in LNLEN, we go through the
whole line and look for the last visible character, that is,
the last one that is higher than $20 (space). That's where our
line ends.

We don't use any termination character for lines because it's
too messy. Blocks might not have them, and when we want to
display lines in a visual mode (that is, always the full 64
characters on the screen), we would need complicated CR
handling. It's simpler to fill lines in blocks with spaces all
the way.

# Signed-ness

For simplicity purposes, numbers are generally considered
unsigned. For convenience, decimal parsing and formatting
support the "-" prefix, but under the hood, it's all unsigned.

This leads to some oddities. For example, "-1 0 <" is false
("-1" is really $ffff). To check whether something is negative,
use the "0<" word, which is the equivalent of "$7fff >".

# Branching

Branching in Collapse OS is limited to 8-bit. This represents
64 word references (or a bit less if there are literals and
branches) forward or backward. While this might seem a bit
tight at first, having this limit saves us a non-negligible
amount of resource usage.
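To make the "64 word references" figure concrete, here is a
small back-of-the-envelope sketch in Python (not Collapse OS
code). It assumes that the 8-bit branch offset is a signed byte
count and that each word reference takes one 16-bit cell. Both
are assumptions made for illustration. The name branch_reach is
hypothetical.

```python
CELL_BYTES = 2    # 16-bit cells, per "Cell size and memory map"
OFFSET_BITS = 8   # branch offsets are limited to 8-bit


def branch_reach(offset_bits=OFFSET_BITS, cell_bytes=CELL_BYTES):
    """Return (backward, forward) reach in whole word references.

    Assumption: the offset is a signed two's-complement byte count.
    """
    backward_bytes = 2 ** (offset_bits - 1)      # 128 bytes back
    forward_bytes = 2 ** (offset_bits - 1) - 1   # 127 bytes forward
    return backward_bytes // cell_bytes, forward_bytes // cell_bytes


if __name__ == "__main__":
    print(branch_reach())
```

Under these assumptions, the reach is 64 references backward and
63 forward. Literals and nested branches take extra bytes, which
is why the practical figure is "a bit less".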
The reasoning behind this intentional limit is that huge
branches are generally an indicator that some logic ought to be
simplified. So here's one more constraint to help you towards
simplicity.

# Interpreter and I/Os

Collapse OS' main I/O loop is line-based. INTERPRET calls WORD
which then iterates over the current "input buffer" (INBUF) for
characters to eat up.

Usually, that input buffer is a 64-character space in SYSVARS
where characters typed through KEY are buffered, but that's not
always the case. During a LOAD, the input buffer pointer changes
and points to one of the 16 lines of the BLK buffer. WORD eats
it up just the same, but it ain't coming from KEY anymore. When
the 16th line is read, we come back to the regular program.

Back to KEY. It always yields a character, which means it
blocks until it has one to yield. It loops over KEY? which
returns a flag telling us whether a key is pressed, and if
there is one, the character itself.

KEY? is an alias which points to a driver implementing this
routine. It can also be overridden at runtime for nice tricks.
For example, if you want to control your computer from RS-232,
you can do "' RX 'EMIT !".

# System values

Most SYSVARS described in doc/impl have a CONSTANT
corresponding to their absolute address. For example, you get
the value of "NL" with "NL @" and set it with "NL !".

Some SYSVARS are used very often and need faster access. These
SYSVARS are split in 2 words: the accessor and the address. For
example, we have HERE and 'HERE. HERE returns HERE's value
directly and 'HERE returns HERE's address. Therefore, you get
HERE with "HERE" and set it with "'HERE !".

The list of such SYSVARS is: HERE CURRENT IN( IN>

# BEGIN..NEXT

Most traditional Forths have DO..LOOP. Collapse OS has
BEGIN..NEXT. It only stores one number on RS instead of 2. That
number is decremented at each NEXT and the loop exits when it
reaches zero.

The initial value for this loop counter must be manually placed
on RS.
Example: 42 >R BEGIN NEXT.

# The A register

The A register is an out-of-stack temporary value that often
helps minimize stack juggling. Its location is arch-dependent,
but it's often in SYSVARS. On register-rich CPUs, it's a
register.

Access to it is fast, but its downside is that words using it
must be careful not to call other words that also use the A
register. doc/dict indicates such words with *A*.

# Dealing with performance bottlenecks

Because Collapse OS runs on multiple CPUs, dealing with
bottlenecks is a bit tricky. In arch-independent application
code (VE, ME, assemblers, emulators), we want to avoid having to
maintain bottleneck words for every supported architecture.

The way we deal with this situation is by declaring bottleneck
words as "back-overridable" with the word ?: (instead of :).
This word creates a new word only if the specified name doesn't
already exist in the dictionary.

With this, you can optionally load "speedup words" for your
arch, and then load your app. Your sped-up version will
supersede the default, slow version and your bottlenecks will
be faster.

Example:

    \ My super app
    ?: slowstuff ( ... ) ;
    : myapp ( ... ) slowstuff ( ... ) ;

    \ My arch-specific speedup
    CODE slowstuff ( ... ) ;CODE

If you load the app without loading speedups, "slowstuff" will
be slow, but will work under all arches. If you load your
speedups first, then the Forth version of "slowstuff" will never
be created and "myapp" will refer to the fast "slowstuff"
instead.

# Mass storage through disk blocks

Collapse OS can access mass storage through its BLK subsystem.
See doc/blk for more information.

# Useful little words

In Collapse OS, we try to include as few words as possible in
the cross-compiled core, keeping it minimally functional for
reaching its design goals.

However, its source code has a section called "Useful little
words" at B120, and you'll probably want to load some of them
quite regularly because they make the system more usable.
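The load-order behaviour of ?: described under "Dealing with
performance bottlenecks" can be modeled outside of Forth. The
following Python sketch is a toy dictionary, not how Collapse OS
stores words. Dictionary, define, define_if_absent, load_app and
load_speedups are all hypothetical names, and binding
"slowstuff" when "myapp" is defined is an assumption that
mirrors Forth's compile-time lookup.

```python
class Dictionary:
    """Toy word list: newest definitions are found first."""

    def __init__(self):
        self.entries = []  # (name, implementation), oldest first

    def find(self, name):
        for entry_name, impl in reversed(self.entries):
            if entry_name == name:
                return impl
        return None

    def define(self, name, impl):
        """Stand-in for ':' or CODE: always adds a new entry."""
        self.entries.append((name, impl))

    def define_if_absent(self, name, impl):
        """Stand-in for '?:': adds an entry only if the name is unknown."""
        if self.find(name) is None:
            self.define(name, impl)


def load_speedups(d):
    d.define("slowstuff", lambda: "fast")


def load_app(d):
    d.define_if_absent("slowstuff", lambda: "slow")
    slowstuff = d.find("slowstuff")  # resolved when myapp is compiled
    d.define("myapp", lambda: slowstuff())


def run(with_speedups):
    d = Dictionary()
    if with_speedups:
        load_speedups(d)
    load_app(d)
    return d.find("myapp")()
```

In this model, loading the speedups after the app has no effect
on "myapp", because "myapp" already refers to the slow
definition. That is why the speedups have to be loaded first.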
# Contexts

B122 provides the word "context", which allows multiple
dictionaries to exist concurrently. This lets you develop
applications without having to worry too much about name
clashes because those names exist in separate namespaces.

A context is created with a name like this:

    context foo \ creates context "foo"

When a context is created, it is "branched off" CURRENT as it
was at the moment the context was created.

To activate a context, call its name (in this case, "foo").
This will do two things:

1. Save CURRENT in the previously active context.
2. Restore CURRENT to where it was the last time "foo" was
   active (or created).

Note that creating a context doesn't automatically activate it.

# DOER and DOES>

In traditional Forths, DOES> is often used with CREATE. Not in
Collapse OS. To use the DOES> word, you must pair it with DOER.
See doc/primer for details.

# Code generation

The kernel has 3 words that generate native code. Although
they're there as support for defining words (:, CONSTANT,
etc.), they can be used for interesting things.

These words are JMPi!, CALLi! and i>!, and they all have the
signature "n a -- len".

For example, let's say that you're debugging the kernel and
want to ruthlessly patch a word with another behavior you're
trying out. You could do:

    ' newword ' wordtopatch JMPi! DROP

And poof! wordtopatch is now an alias to newword.
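To picture what a word with the "n a -- len" signature does,
here is a toy Python model of the patching example above. It is
a sketch only: it assumes a Z80-style target where an absolute
jump is the 3-byte sequence $c3 followed by the little-endian
address. The bytes the kernel really emits depend on the CPU.
jmpi_store and both addresses are hypothetical.

```python
JP_OPCODE = 0xC3  # Z80 "JP nn"; other CPUs use other encodings


def jmpi_store(mem, n, a):
    """Toy stand-in for JMPi! ( n a -- len ), Z80 flavour only.

    Writes a jump to address n at mem[a] and returns the number
    of bytes written.
    """
    mem[a] = JP_OPCODE
    mem[a + 1] = n & 0xFF          # low byte first (little-endian)
    mem[a + 2] = (n >> 8) & 0xFF   # then high byte
    return 3


mem = bytearray(0x10000)                # 64K of toy memory
NEWWORD, WORDTOPATCH = 0x1234, 0x2000   # hypothetical addresses
length = jmpi_store(mem, NEWWORD, WORDTOPATCH)
```

The returned length is what the DROP in the Forth example
discards. It matters when generating several instructions in a
row, because it tells the caller where the next one goes.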