Crim: an extensible inner interpreter

Crim

In the middle of 1995 I started to work on a different inner interpreter for Forth that would let me code words that take immediate data, like <."> and LIT, in an easier way. I wanted that because I was trying to code in Forth some features from other languages (Icon's generators, co-expressions and continuations; regexps; backtracking; C's structs; etc.), and the task seemed very difficult because there were lots of choices to be made right at the beginning, and each one of them seemed to affect the code coming after it too much.

I was being forced to choose some simple data structures and stick to them, but that was against the spirit of the problems I was trying to tackle...

I wanted my data to be as free-form as a Forth program, and so the obvious step was to blur the distinction between data and program. Most operations on complex data should be like executing it with the right inner interpreter. The inner interpreter would have to be VERY extensible.
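A toy illustration of "executing data", written in Python rather than Forth and not taken from the notes: the record layout, the tag names and the `execute` helper are all made up for this sketch. The same flat stream of cells is walked twice, each time with a different table of handlers playing the role of the inner interpreter.

```python
# Hypothetical record format: a flat list of tags, each followed by
# its immediate data (exactly one cell per tag in this sketch).
record = ["NAME", "crim", "INT", 1995, "INT", 3]


def execute(cells, handlers):
    """Walk the cells; the handler for each tag consumes its immediate data.

    Each handler receives (cells, index_of_first_data_cell) and returns
    (number_of_data_cells_consumed, result_or_None).
    """
    pos = 0
    results = []
    while pos < len(cells):
        tag = cells[pos]
        consumed, value = handlers[tag](cells, pos + 1)
        if value is not None:
            results.append(value)
        pos += 1 + consumed
    return results


# "Interpreter" 1: render the record as text.
printer = {
    "NAME": lambda c, i: (1, "name=" + c[i]),
    "INT": lambda c, i: (1, "int=" + str(c[i])),
}

# "Interpreter" 2: keep only the integers, skipping everything else.
summer = {
    "NAME": lambda c, i: (1, None),
    "INT": lambda c, i: (1, c[i]),
}

if __name__ == "__main__":
    print(execute(record, printer))
    print(sum(execute(record, summer)))
```

Printing and summing are two operations on the same data, and neither needed the data to be converted into a fixed structure first; adding a third operation only means writing a third table.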

I wanted the code (and data) of my programs to look more like pseudo-code. I wanted it to be EXTREMELY short, to the point where the hex dump of an entire complex program could be shown on a single text screen. I wanted to single-step and to debug the hex dump, not the source, as I knew that it would take months before I had a good syntax for the source. I wanted to search for the perfect bytecode, and then for the perfect optimizer for it; then I would try to get a good syntax, for I would have the tools to compile the syntax into a program: the translation from source to bytecode uses almost the same tools as the translation from bytecode to optimized bytecode. I didn't have those tools at that time, and I didn't like things like YACC -- too restrictive, too blackboxish, it transforms its input too much, the transformation can't be broken into lots of useful small transformations, and it uses C.

So, here are some notes on the very beginning of this project:

ETC
letter
announce

1.4th
1-autod.4th
1.aud

2.4th
2-autod.4th
2.aud

crimcomp.4th
crim.c
autodoc.4th
patchpfe

Note that they only show how to use a third stack (the "streams stack") and an inner interpreter with a variable number of states to implement immediate-data words easily. They are written in clunky English with a zillion excess commas, and I haven't had the time to update them. Don't try to make the code run: PFE is almost dead, and the comments are so much better than the code.
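For readers who want a rough picture before opening the notes, here is a toy model in Python. It is only a sketch of the general idea, not the code from the notes and not how Crim actually lays things out: the class `Machine`, the state names, the word `NLIT` and the representation of a stream as a `[cells, position]` pair are all assumptions made for the example. Words like LIT do not fetch their data themselves; they switch the inner interpreter to another state, and that state's handler reads the next cell from the stream on top of the streams stack.

```python
# Toy model only; every name here is hypothetical.

class Machine:
    def __init__(self):
        self.data = []      # data stack
        self.streams = []   # streams stack: entries are [cells, position]
        self.out = []       # collected output
        self.state = "forth"
        # state name -> handler; more states can be added at run time
        self.states = {
            "forth": self.st_forth,
            "lit": self.st_lit,
            "str": self.st_str,
        }
        # a word is either a Python callable or a list of cells
        self.words = {
            "LIT": lambda: self.enter("lit"),
            '."': lambda: self.enter("str"),
            "DUP": lambda: self.data.append(self.data[-1]),
            "+": lambda: self.data.append(self.data.pop() + self.data.pop()),
            ".": lambda: self.out.append(str(self.data.pop())),
            "DOUBLE": ["DUP", "+"],
        }

    def enter(self, name):
        self.state = name

    def next_cell(self):
        """Fetch the next cell from the stream on top of the streams stack."""
        stream = self.streams[-1]
        cell = stream[0][stream[1]]
        stream[1] += 1
        return cell

    def st_forth(self):
        # Default state: the next cell names a word; execute it.
        word = self.words[self.next_cell()]
        if isinstance(word, list):
            self.run(word)      # nested definition: push a new stream
        else:
            word()

    def st_lit(self):
        # The next cell is immediate data: push it, go back to "forth".
        self.data.append(self.next_cell())
        self.state = "forth"

    def st_str(self):
        # The next cell is an immediate string: output it.
        self.out.append(self.next_cell())
        self.state = "forth"

    def run(self, cells):
        stream = [cells, 0]
        self.streams.append(stream)
        while stream[1] < len(cells):
            self.states[self.state]()
        self.streams.pop()


def demo():
    m = Machine()

    # Extending the interpreter: a new state plus a word that enters it.
    def st_neg():
        m.data.append(-m.next_cell())
        m.state = "forth"

    m.states["neg"] = st_neg
    m.words["NLIT"] = lambda: m.enter("neg")

    m.run(["LIT", 2, "NLIT", 7, "+", "DOUBLE", ".", '."', "done"])
    return m.out


if __name__ == "__main__":
    print(demo())
```

In this model a new immediate-data word costs one entry in `states` and one in `words`, and the main loop never changes. That is the sense in which the number of states is variable.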

Some time after that I found a nice way to implement backtracking and to display the state of a backtracking interpreter with a nice 2D diagram. This would make top-down parsers easier to write and more fun to debug. I still don't have notes on it, and when I have the time I think I'll try to implement it in Tcl/Tk to make it accessible to a wider audience. (Ramblings:) We Forthers are so few, we don't have a good free Forth for Linux, and I don't have any money left to spare... GForth is too ANSIsh and made for people who don't care about the internals or the bytecode. I don't care too much about speed or standardness, I just want my mental orgasms back.

(History: wrote the notes in 1995. Showed them to some (2?) people. Thought about cleaning them up and publishing them somewhere. Didn't. They sat on my disk for all these years. Shame on me. Wrote this page on 99mar08.)

Eduardo Ochs
My home page (very messy)