Subj : Re: How-to on writing an interpreter?
To   : comp.programming
From : Rob Thorpe
Date : Mon Jul 25 2005 01:26 pm

Flavius Vespasianus wrote:
> "Rob Thorpe" <robert.thorpe@antenova.com> wrote in
> news:1121712211.057742.195680@f14g2000cwb.googlegroups.com:
>
> > Flavius Vespasianus wrote:
> >> "C. Rebert" <cvrebert@gmail.com> wrote in
> >> news:1121485564.371721.318960 @g43g2000cwa.googlegroups.com:
>
> >> The basic process is to define the language and to define the
> >> intermediate structure to be interpreted. The latter will be symbol
> >> tables and trees/graphs representing the code to execute.  (OO
> >> languages, such as C++ are EXTREMELY well suited for such
> >> applications)
> >
> > Actually I can't think of anything less useful for contructing trees
> > than OO.  It can be done very simply using procedural programming.
> >
>
> Whooooooooooooah THERE.
>
> > Each node of the tree is a little struct like this: -
> >
> > struct node_t
> > {
> >   tree_t type; /* This is an enum describing what this tree node is */
> >   operand_t value; /* A union/struct storing data vals in some way,
> >                       or ptrs to them.  */
> >   struct node_t *left; /* next child node to the left */
> >   struct node_t *right; /* next child node to the right */
> > } node_t;
> >
>
> Then at some point you are going to have
>
>     switch (node->type)
>     {
>     case IMULTIPLY: ...
>     case IDIVIDE:   ...
>         .....
>     }
>
> Each time you create a new typ of note you have to update the switch
> statement.

Yes, for this kind of tree.

> A parse tree node could look something like this:
>
> class IF_STATEMENT : public STATEMENT
> {
>   LOGICALEXPRESSION *test ;
>
>   STATEMENT *true_branch ;
>   STATEMENT *false_branch ;
>
>   virtual STATEMENT *execute ()
>   {
>      if (test->evaluate ())
>        return true_branch ;
>      else
>        return false_branch ;
>   }
> } ;
>

Yes, you can do it like that.
You could also put the tree type, and all the operations on it in one
class, which may be neater in cases where it's simple, but more complex
where the tree is more complex.

As Jon has said, it is a trade off.  If you must make many type of node
it may be beneficial from the point of view of simplicity to
encapsulate each one.  But, if you must iterate over the nodes in the
tree it may be better to do it another way.  You may want to do this
before interpreting them.

> With OO programming the nodes are completely self-contained. You can can
> add and delete node types without having to modify any of the interpreter
> other than the parser.
>
> Much easier than what you suggested.

There are easy ways to do this with a small modification of the design
I suggested.

  In the interpreter we will build there will likely be some set of
fundamental operations, and some set of less fundamental operations.
We can deal with the fundemental operations using the switch statement
you mention earlier, these are the operations mentioned in tree_t.  The
other operations can be dealt with by having a tree_t type called
something like "function".  This operation takes left and right, it
considers "left" (or the node left) to be a function pointer.  It then
calls the function pointer with "right" as the function's argument.
This means each of the second level operations require no change to the
switch statement, the parser just puts the pointer to function "foo" in
the node it makes when it finds the word "foo".  This "two-level
method" is used in many interpreted languages.

We can go further than this and remove tree types altogether:

struct node_t
{
  void* func; /* Function to call to comprehend this node */
  operand_t value; /* A union/struct storing data vals in some way,
                      or ptrs to them.  */
  struct node_t *left; /* next child node to the left */
  struct node_t *right; /* next child node to the right */

} node_t;

Now the interpreting function calls the function func with the
arguments "value", "left" and "right", which deals with everything else
from there.  This method is more-or-less equivalent to the one you show
with the class.  This also removes the need for a switch-case
statement.  Though it's not very nice, for several reasons, for example
a function call is needed to interpret every tree node.
FWIW The biggest problem with the code I posted is that it creates a
binary tree.  Something like a lisp-list or an M-ary tree would be more
appropriate, to allow for situations with multiple operatnds.

.