Nimf is an interpreted language written in golang. It bears no relationship to the language 'nim', which is very different. Nimf is an implementation of a concatenative language in a more or less forth-ish style. It has been created mostly as a learning exercise, but is definitely usable for certain types of programming tasks as well as for educational purposes.
After languishing for a few years while I worked on my other language (slope), I found some time to return to nimf and write a proper lex/parse cycle. This simplified a lot of the execution, allowed for better debugging/error tracing, and hopefully will be a better foundation to build on in the future. These changes do have the effect of moving away from the pseudo-register based code approach for certain parse features; for example, strings are now a parsed item, and not a flag that is toggled to change the interpreter mode.
- Building/Installing Nimf
- The Language
- The Interpreter
- Syntax Highlighting
A Go compiler is required. I am not certain how old a version will work, but there are no external dependencies and I imagine anything >= 1.11 should be fine. Nimf was developed with Go 1.12 and 1.13.
A BSD compatible makefile has been included and contains a few targets of interest:
- make will build the interpreter's binary in the current directory
- make install will install the binary, manpage, and standard library to the
/usr/local file tree (adjustable via the
  - This may require administrator privileges (sudo or the like) depending on your system setup
- make install-local will build the binary in the current directory and install the standard library to either
  - This is useful on systems where you do not have administrator privileges
- make install-lib will install the standard library to the
  - This is generally desirable when developing core libraries and not wanting to build everything every time
  - Like the previous one, but installs to either
go install is also an option, but it won't help you out with the standard lib and the manpage.
Nimf is currently hosted in a git repository which can be found at git.rawtext.club/nimf-lang/nimf. The main documentation is this README. However, full api documentation is coming soon and will be hosted over https/gopher/gemini.
Things will look at first glance more or less like forth (nimf stands for: nimf is mostly forth). Anything inside of
( ) is a comment. Nimf only supports this type of commenting, which works over multiple lines and can be anywhere inside a line, but at present cannot be nested.
Upon launching the REPL only the builtins will be loaded. These powerful words provide the building blocks for all of the nimf libraries, but lack a lot of the niceties of using the modules/libraries. nimf comes with a number of modules that can be loaded using
inline, which will bring all of the variables and words into the global scope. Most modules also call other modules. For example, the
text module inlines the
std module. That means that if you only inline
text, you also have access to
std. However, it is recommended to be explicit and load all of the modules you intend to use rather than relying on hidden loading. There is no performance cost to calling
inline on a module that has already been loaded: it will not be reloaded and no additional parsing will need to be done.
"text" inline "Hello, world" str.print-buf
In the above we first inline the
text library. Nimf makes a distinction between builtins, which are coded in Go, and module functions, which are coded directly in Nimf from the primitives offered by the builtins. The text library contains words prefixed with
str (for character and string, respectively).
After inlining the
text module we start a string literal with
" and finish the string literal with
". Note that unlike in a traditional forth system, strings in nimf are handled by a lexer/parser and are not handled via
" as a word itself (early versions of nimf did have
" as a word, but it was changed to syntax in order to have more useful strings). The string is saved to temporary string storage (starting at memory address
50). The word
str.print-buf is used to print the string currently held in temporary string storage. The temporary storage gets used by many words and should not be counted on to hold a string for any longer than the next word call. The string could instead be saved to a variable:
"text" inline "Hello, world" svar myString myString str.print
In the above we start out like before, but instead of just printing we first call
svar myString.
svar reserves enough memory for the string currently in the temporary string buffer and copies that buffer over to that memory. It then creates a reference to that memory via the word that comes after
svar, in this case
myString. We then use
myString, which puts the address of the string onto the stack, and
str.print, which takes the address off the top of the stack and prints the string stored at that address.
str.print is used in
str.print-buf which is implemented as
str.buf-addr str.print (with
str.buf-addr being the address of the temporary string buffer). As such, we could also have done:
"Hello, world" str.buf-addr str.print. There are lots of options.
Each of the below will leave the result on the stack. We could call
, after each one to print the result, but here we will call
.s at the end to view the stack itself.
5 6 + ( 11 )
3 9 - ( -6 )
2 4 * ( 8 )
7 2 / ( 3 )
7 2 % ( 1 )
7 2 /% ( 1, 3 )
.s ( Prints: <7> [ 11, -6, 8, 3, 1, 1, 3 ] )
The .s builtin prints the stack. The first value in the example above (
<7>) is how many items are on the stack, followed by the stack itself.
There are a number of numerical helpers available in the
num module (
"num" inline) as well as some expanded comparison operators available in the
std module (on its own nimf provides
: squared ( n -- n*n :: squares a number ) dup * ;
: starts a subroutine declaration, everything between ( and
) is a comment,
dup makes a copy of the top item on the stack,
* multiplies the TOS by the item under it, and
; ends the subroutine definition.
We can now call it like so:
5 squared .
. drops and prints the top item on the stack (TOS) and adds a space after it. To avoid the space, use
,. The above would output 25.
Variables default to being sized to a single cell (size of
int on your system, likely either 32 or 64 bits). From this basic variable you can store numbers, characters, flags, other memory addresses, etc. You can also extend variables for use in more complex structures or easily store and retrieve strings in memory.
A major limitation, and thus an adjustment when coming to nimf from many other languages, is that words do not have local variables/scope. A variable cannot be created within a subroutine and must exist in global space. To work around this limitation you will often see a word move the contents of a global variable onto the return stack at the beginning of the word, then use that variable/memory during operation, then at the end of the word move the values from the return stack back to their variables. This methodology allows you to use global variables with local values in a local scope and then return their state after you are done. If this sounds complicated or is a little beyond your usage for the language at present: don't worry. You will likely know when you need it and it will likely make more sense then. To see an example of how that sort of value passing via the return stack might work you can look at the
text module. At the very bottom it defines two words that move the
text module values onto and off of the return stack. They are called in a lot of the words in the
Variables can be named anything you like with the following exceptions:
- A variable, or a word for that matter, cannot contain only digits (ex.
23) as the interpreter will treat this as an integer
- A variable or word name cannot take the form of a decimal, hexadecimal, octal, or binary number. Using numbers in a var name or word is fine, so long as they do not match the established patterns for these number forms
- A variable or word name cannot contain whitespace
As a convention, not enforced at a code level, private words can be created by using the following variable/word naming scheme: module-name.private.name. For example:
url.private.port would be part of the
url module and is intended to only be used internally, so it is marked private. It is then given the name port. Marking a word as
private in this way excludes any variables and words containing
.private. from the word listing provided with the word
words. In reality you can still call private words if you like, but private is one way a developer can provide intent to other developers.
var myvar
myvar . ( The address of myvar: 101234 )
myvar get . ( The value of myvar: 0 )
5 myvar set
myvar get . ( The value of myvar: 5 )
8 myvar +! ( Adds 8 to the value of myvar - not to the address )
myvar get . ( The value of myvar: 13 )
The first line, above, adds the name
myvar to the dictionary and assigns it a memory address. The second line prints the address of
myvar. The third line prints the value at that address. All variables can be thought of as pointers, for those familiar with the term. You can get the value stored at an address with get or
@ (they are the same). Since we have not given
myvar a value yet, the value is 0. The set and
! builtins update a memory address. The following line shows that the value has been updated. The
+! subroutine adds to the value at an address and the last line shows the result of this update.
Nimf has the ability to store variables that take up multiple cells. This can be done with the
allot keyword. Using allot will allow you to create something like arrays and is used often for managing strings.
: ? ( addr -- ) @ . ;
var myArray ( Reserve a memory address )
5 allot ( myArray is already 1 cell, '5 allot' adds 5 more for a total of 6 )
5 myArray set ( Set myArray to 5, so that the length of the array can be referenced )
9 myArray 1 + set ( Set the first offset, the first non-length value, to 9 by referencing `myArray 1 +` )
2 myArray 2 + set ( Set the second offset to 2 by referencing `myArray 2 +` )
myArray ? myArray ++ ? myArray 2 + ? ( 5 9 2 )
In the above example we first create a simple way to print the value at a memory location via the word
?. After defining
? we create a variable
myArray. We then
allot 5 extra cells for this variable.
allot does not take a memory address, it just expands the most recently created variable. We then update each cell in the "array" to a value; this is done by providing an offset to
myArray. Lastly we use our newly created
? to view the value at each offset position.
allot operates in an increasing manner on the next available memory. You cannot define a variable (A) then another variable (B) and go back and allot more for (A). You must
allot before creating any other variables. Note that it is possible that inlining a module may also assign new variables and make it so that you can no longer allot for a variable you created, so be aware of this limitation of
allot when creating additional variables or inlining modules. Best practice is to initialize a
svar and then immediately
"text" inline
"hello" svar hi ( Puts 'hello' into the temporary string buffer, reserves memory space, copies the string into it, and adds a new word, 'hi' )
hi . ( Memory address: 63214 )
hi str.print ( Outputs: hello )
str.print-buf ( Outputs: hello )
"hola" str.print-buf ( Outputs: hola )
hi str.print ( Outputs: hello )
hi @ . ( Outputs: 5, the length of the string )
hi 2 + @ emit ( Outputs: e, the second char )
We first inline the
text module so that we have access to the print oriented words (which could easily be defined on your own without the need for
text, but that is not in scope here). We then create a string
hello. That puts
hello into the temporary string buffer. Calling
str.buf-addr str.print would print out
hello. Instead we call
svar hi, which secures enough memory to hold the string found in the temporary string buffer and copies the string, including its length, to the memory location secured by
svar. Following the assignment is an example of printing the string as well as printing the temporary string buffer (which still contains the same string). We then add a new string to the temporary string buffer and print it, then print
hi to show that they now differ. Outputting the value at
hi gives the length of the string. Lastly we output the value at hi+2, as a character via
emit, which will convert an integer to a character and output it. This yields e.
Strings are stored in memory like so:
"hello" svar hi
hi . ( Address: 920 )
hi @ ( Value: 5 )

- - - - - - - - - - - -
Address: | 920 | 921 | 922 | 923 | 924 | 925 |
Value:   |   5 | 104 | 101 | 108 | 108 | 111 |
                 'h'   'e'   'l'   'l'   'o'
The words found in the
text module know to treat the first value as the length and the rest of them as characters. For example, calling:
108 emit would print
l. Mapping strings in this way allows you to get part of a string based on a simple offset value. The third character of the above example can be acquired very easily:
hi 3 + @. Simply add 3 to the address and get the value. In that light, arrays and strings can be thought of as 1 indexed, as opposed to the often more common 0 indexed.
Much like simple variables, strings also have getters and setters. Calling
hi get-string, referencing the above example, would copy the value of the string that hi references into the temporary string buffer. Moving in the other direction we can move a string from the temporary string buffer to a memory address with
"hola" 2300 set-string. That example would move the string 'hola' to memory address 2300. This is useful, but dangerous: you need to know that enough memory is writable at that spot to support the string. Otherwise you risk overwriting memory you might be using for something else. So, be careful. It is often useful to
allot more memory than you need via
var stringSpace 500 allot or the like. Then you can always overwrite up to 500 characters when assigning strings to the variable
stringSpace. You can use
allot to manage string memory in a more fine grained way and
svar when you have a string already and just want it moved into memory.
Local variables are the only kind of variable that can be created within a word definition. In fact, they can only be defined inside of a word definition. When the word finishes executing all of the words that it is composed of, the local variables will be cleared from memory automatically.
: local-example
  local x
  5 x set
  x , space cr ( print the memory address of `x` )
  x get . ( print the value of `x` )
;
local-example
x ,
In the above example we create a word
local-example. In that word we use the
local word to create a variable called
x. The value held at memory address
x is updated to
5 and that value is printed, along with the value of
x itself, which is a memory address. We call the word after defining it. After it prints the memory address of
x it completes the word and clears the memory. When
x is referenced outside of the word,
x will not be found in the word dictionary and an error will be thrown (unless there is a global variable named x).
Some things to note:
- Local variables can have the same name as a global variable. The local one will always be used when both exist inside of a word.
- Local variables are only accessible in the word they reside in, sort of. The memory addresses they use will be available during the execution of the word containing the local, and thus all words called within that word will have the memory available as well. However, they do not have access to the variable name (
x in the above example). So if you want to use that memory in another word, simply put the value on the stack for the next word to operate on.
- If a local is created within a loop, a new memory address will be assigned to that local variable at each loop iteration. So the naming will be overwritten. The memory that it previously occupied still exists until the end of the word though. This is a tricky quirk to utilize, but know that it is possible.
- You can still allot more space for a local variable (in order to, for example, store a string or array), but to do so you must use the
lallot word, rather than
allot, as they operate on different pointers to the same memory space.
What would a programming language be without branching? Branching (if, else,
then) can only be used within word definitions and functions as follows:
"text" inline
"num" inline
: mySubroutine ( n -- )
  dup 25 > if
    . "is greater than 25" str.print-buf exit
  then
  dup num.positive? if
    . "is greater than 0" str.print-buf
  else
    . "is less than or equal to zero" str.print-buf
  then
;
50 mySubroutine ( Output: 50 is greater than 25 )
In the very contrived branching above we put 50 on the stack. We then compare 50 and 25 via the
> word, which will leave
-1 on the stack if 50 is greater than 25 or
0 if not.
if will branch based on the value on TOS. In this case it is truthy (
-1, or any value other than
0) so it enters the first branch and prints out
50 is greater than 25. Nesting can occur by adding a new conditional inside an
else. Remember to use
dup to duplicate the value on the top of the stack if you will want to use it beyond the conditional.
The branching in nimf is currently a little funky in its implementation. Deep nesting can often have unexpected results. This is actively being worked on. Guard clauses, such as the one above (the first
if where an
exit is used inside), are encouraged as a way to reduce code complexity. Not all situations will allow for using
exit, which leaves the current word immediately (similar to a return in a C based language, except that it doesn't return anything since the stack is a persistent structure). Using guards and exit avoids most of the current pitfalls with branching.
The std library can be inlined to provide a number of conditional logic constructs including
0<, etc. The
num module also contains some useful items. In the above example
num.positive? was used to see if a number was greater than zero. You can, of course, just use
dup 0 > instead of
num.positive? but some words that use mostly symbols can be hard to remember and a clearly named word like
num.positive? can improve code readability should you need to come back to it at a later time.
Branching utilizes the return stack, so be careful when using the return stack inside of an
else block. Anything you put on the return stack should be taken off before the conditional segment you are in ends (so before
then if you are in the truthy segment and before
then if in the falsy segment). Care should also be taken when using the return stack in nested
Nimf currently only supports one type of loop.
do [...] loop.
do marks the beginning of a loop. The code within a do will always be run at least once, unless it is surrounded by an
if [...] then construct. The
loop keyword eats TOS and if the value is truthy will return to
do, otherwise the loop will end and execution will continue outside of the
do [...] loop construct. Like branching, loops can only be used within subroutines:
"std" inline
: to100 ( n -- )
  dup 100 <= if
    do
      dup . ++ dup 100 <=
    loop
  then
  drop
;
1 to100 ( 1 2 3 4 5 6 7 8 [...] )
In the above example we inline the
std lib (to gain access to
<=) and create a subroutine
to100. The subroutine first checks that TOS is less than or equal to 100, if not it just drops TOS and ends. If so it enters a loop, duplicates and outputs TOS, increments TOS, duplicates TOS and checks to see if it is still less than or equal to 100... if so it loops, if not it leaves the loop. It then drops top of stack.
This is a fairly basic example and shows simple looping and conditionals. More complex loops may require the use of counters or other stored information. For an example that uses variables look in
prints, which we used above for string printing (it actually needs some work to make sure previous variable states don't get overwritten. That improvement is coming soon TM)
Similar to branching, take care when using the return stack inside a loop. Anything you put on the return stack after
do should be taken off before reaching loop.
Throwing an error can be done as follows:
: errorTest ( -- ) 1 2 + "Random error" error 5 * ;
Calling errorTest above will result in 3 being left on TOS and an error message being thrown stating
Error: Random error. Code execution will stop there, so the 5 * is never reached. If you are running in interactive mode, the stack will be cleared and all operation flags reset. If you are running from a file, execution will cease and your program will exit with a non-0 exit code.
It is also possible to exit a program early without an error message via the halt word.
halt eats top of stack and exits the program, setting the value it received from top of stack as the exit code for the program.
nimf [options] [filepath]
Nimf can be run with or without a file as input. When a file is provided as input the interpreter will run the contents of the file without command prompts or interactivity beyond what was coded in the file and will exit when the file has completed (or an error will be displayed).
Running nimf without any filepath will launch nimf in interactive mode. The user will be presented with a REPL and will be able to input code and see results in real time.
Nimf works fine with shebang lines (ex.
#! /usr/bin/env nimf) and can thus run executable nimf scripts directly.
When invoking nimf in either interactive or file mode the following command line options are available:
- -memory [int]: The number of memory cells to run nimf with (default: 250000, min: 34999)
- -stack-depth [int]: The depth of the two stacks (data and return, default: 250, min: 1)
- -run [string]: Pass in a string of commands to run as a one liner, similar to python's -c flag
- -h: Print command help and exit
- -v: Print the version number and exit
- -limit-io: Run the interpreter in a mode that does not allow filesystem access
- -install-mod [string]: Install a module to the local lib from a path or url; supports http(s), gemini, gopher, and local files. Note that this should be a single file, not a repo or directory
nimf -memory 50000 -stack-depth 335 ./my-file.nf
nimf -run '1 2 + 3 4 * 5 .s'
By convention nimf files that are meant to be executed have the filetype
.nf (nimf file). If a file is meant to be inlined into another file and only contains variable declarations/allotment and word definitions it should end in
.nh (nimf header).
When using the
inline builtin you can often just use the name of the file you are wanting to inline. Nimf will search for the file in the following order:
- Your local directory:
- A lib directory in the local folder:
- The system lib folder:
If the filename you are inlining does not have a suffix, nimf will look for
[filename].nh. So when running:
"std" inline
The interpreter is looking for
std.nh in each of the three locations and loading the first match it finds. If a module is requested via
inline and nimf has already loaded it, the
inline command will be ignored (no extra searching or processing will be performed).
At present the repo does come with one example file. You can run it, from the nimf directory, like so:
make
./nimf ./examples/ascii.nf
You should see a nicely formatted table of the printable ascii characters appear on your screen.
Additionally, the module
gopher can be inlined in interactive mode to provide a minimalistic but usable gopher client interface:
"gopher" inline
gopher.visit ( will query for host and path )
( ... Prints the text file, parsed gopher map, or an error )
5 gopher.follow ( will follow link #5 )
( ... Prints the text file, parsed gopher map, or an error )
gopher.back ( will return to, and print, the previous page )
( ... )
6 gopher.url? ( will print the address that link 6 would take you to )
The gopher client uses the minimal TCP api available to nimf, more about which will be written in a future version of this document.
If you are a vim user, a syntax plugin for nimf is available here. It includes basic indentation rules as well as syntax highlighting for various structures.
If anyone wants to make an emacs or nano syntax that would be awesome. My text editor hermes (based on Kilo, by antirez) can be easily set up to highlight nimf syntax as well.