Lab 7 - Doing Math in the Correct Order Introduction ============ You have certainly come a long way! That AVL tree was pretty tough. This time around, we are going to build a parse tree. Really, parse trees aren't too difficult to build, so long as you have a good set of pseudocode (from your class notes) and a good understanding of what trees look like in memory. In this lab you will be building a parse tree using our btree.h file. You will not use the AVL tree for this, as this is a different type of tree. The final product should be able to take in a mathematical expression in standard notation and compute the correct answer according to the standard order of operations. In order to do this, you will need a lexer. The lexer is actually the hard part of this, so I have provided that for you. Also, you'll need a healthy dose of tree iterators, which we wrote in class. I've included those in the guided part as well. Once you are done with this, you will have the beginnings of a small compiler! Add in a little code generation, and you're set to write your own programming language. So now, with that bit of a pep talk, let's dig in. The Lexer and Tree Iterators ============================ The first order of business is to create a lexer. A lexer is just a function or program which decomposes a string into lexical elements. That way you can reason about things like the order of operations in the tree, and not worry about how you handle multiple characters in a number, or decimal points, etc. I have written a very simple lexer which I want you to copy and try out. This one can decompose a string into double values and into operators. It also sets the precedence level according to the order of operations, where the higher the level, the deeper it goes in the tree. Let's start with the header file: --begin lexer.h-- 1 /* A simple lexer for mathematical statements */ 2 3 #ifndef LEXER_H 4 #define LEXER_H 5 #include 6 #include 7 8 struct LexicalElement { 9 enum {NUM, OP, UNKNOWN} type; //element type 10 std::string text; //element text 11 double val; //element numeric value 12 enum {MUL, DIV, MOD, ADD, SUB} op; //operator value 13 int p; //precedence level 14 }; 15 16 17 LexicalElement nextElement(std::istream &is); 18 #endif --end lexer.h-- As you can see, the interface is pretty straightforward. The LexicalElement structure contains all the information we need to process the statement one element at a time. The nextElement function returns the next lexical element from an istream. In our case, we will be using cin as the istream. The implementation file for this function follows: --begin lexer.cpp-- 1 #include "lexer.h" 2 #include 3 #include 4 #include 5 6 using namespace std; 7 8 //static helper functions 9 static bool isoperator(char c); 10 static void handleOperator(LexicalElement &l); 11 static void handleNumber(LexicalElement &l); 12 13 /* returns the next lexical element in is */ 14 LexicalElement nextElement(istream &is) { 15 LexicalElement result; //the resulting element 16 bool done = false; //indicates if we are done 17 bool decimal; //indicates if we have hit a decimal 18 char c; //the current character 19 20 result.type = LexicalElement::UNKNOWN; 21 22 //scan until we've encountered all the characters 23 while(!done) { 24 //peek at the character 25 c = cin.peek(); 26 27 //start a type if needed 28 if(result.type == LexicalElement::UNKNOWN) { 29 if(isdigit(c) || c == '.') { 30 result.type = LexicalElement::NUM; 31 decimal = false; //no decimal seen yet! 32 } else if (isoperator(c)) { 33 result.type = LexicalElement::OP; 34 } else if(isspace(c)) { 35 cin.get(); //consume spaces 36 } else { 37 done = true; 38 } 39 } 40 41 //consume characters (if needed) 42 if(result.type == LexicalElement::OP) { 43 if(isoperator(c)) { 44 c = cin.get(); 45 result.text += c; 46 } else { 47 done = true; 48 } 49 } else if(result.type == LexicalElement::NUM) { 50 if(c == '.' && !decimal) { 51 //we get just one decimal point! 52 decimal = true; 53 c = cin.get(); 54 result.text += c; 55 } else if(isdigit(c)) { 56 c = cin.get(); 57 result.text += c; 58 } else { 59 done = true; 60 } 61 } 62 } 63 64 if(result.type==LexicalElement::OP) 65 handleOperator(result); 66 else if(result.type==LexicalElement::NUM) 67 handleNumber(result); 68 69 return result; 70 } 71 72 73 //static helper functions 74 static bool 75 isoperator(char c) { 76 const char *operators = "*/%+-"; 77 78 while(*operators) { 79 if(*operators == c) 80 return true; 81 operators++; 82 } 83 84 return false; 85 } 86 87 88 static void 89 handleOperator(LexicalElement &l) { 90 //assign appropriate symbol and precedence 91 switch(l.text[0]) { 92 case '*': 93 l.op = LexicalElement::MUL; 94 l.p = 2; 95 break; 96 case '/': 97 l.op = LexicalElement::DIV; 98 l.p = 2; 99 break; 100 case '%': 101 l.op = LexicalElement::MOD; 102 l.p = 2; 103 break; 104 case '+': 105 l.op = LexicalElement::ADD; 106 l.p = 1; 107 break; 108 case '-': 109 l.p = 1; 110 break; 111 } 112 } 113 114 static void 115 handleNumber(LexicalElement &l) { 116 l.p = 3; 117 l.val = atof(l.text.c_str()); 118 } --end lexer.cpp-- Study this set of functions to get a feel for how the lexer works. Really it's all about state and consuming the right kinds of characters. We won't go into a detailed explanation of how it works here, largely because there should be enough comments in the code to show you how it does what it does. The final step is to build a little test application: --begin lexerTest.cpp-- 1 /* 2 * A small test program for the lexer. 3 */ 4 5 #include 6 #include "lexer.h" 7 8 9 using namespace std; 10 11 int main(void) { 12 LexicalElement l; 13 14 //run until the end of the line 15 while(cin.peek() != '\n') { 16 l = nextElement(cin); 17 18 //print according to the type 19 switch(l.type) { 20 case LexicalElement::NUM: 21 cout << "NUM " << l.val << " P: " << l.p << endl; 22 break; 23 case LexicalElement::OP: 24 cout << "OP " << l.text << ' '; 25 26 //render the op enum 27 switch(l.op) { 28 case LexicalElement::MUL: 29 cout << "MUL"; 30 break; 31 case LexicalElement::DIV: 32 cout << "DIV"; 33 break; 34 case LexicalElement::MOD: 35 cout << "MOD"; 36 break; 37 case LexicalElement::ADD: 38 cout << "ADD"; 39 break; 40 case LexicalElement::SUB: 41 cout << "SUB"; 42 break; 43 } 44 45 //display p value 46 cout << " P: " << l.p << endl; 47 break; 48 case LexicalElement::UNKNOWN: 49 cout << "UNKOWN" << endl; 50 break; 51 } 52 } 53 54 return 0; 55 } --end lexerTest.cpp-- This test driver reads in a line of text, and prints out a rendering of the lexical elements. A couple of sample runs follows: $ ./lexerTest 2+2 NUM 2 P: 3 OP + ADD P: 1 NUM 2 P: 3 $ ./lexerTest 3+.5 - 2000 + 6 /2.1 NUM 3 P: 3 OP + ADD P: 1 NUM 0.5 P: 3 OP - ADD P: 1 NUM 2000 P: 3 OP + ADD P: 1 NUM 6 P: 3 OP / DIV P: 2 NUM 2.1 P: 3 Study the test driver to make sure you understand how to use the lexer. Now all that remains is to do the tree iterators which go with btree.h. (You should have the btree.h file from lab6. Go ahead and copy that into your current directory). Here, we just use the treeItr.h file and the btreeTest.cpp file that we wrote in class. Here they are, just in case you didn't manage to type them all in. --begin treeItr.h-- 1 #include "btree.h" 2 3 4 //helper functions 5 template BinaryTree* 6 leftMost(BinaryTree *t) { 7 //protect against null trees 8 if(!t) 9 return NULL; 10 11 BinaryTree *left = t->getLeft(); 12 13 //go left iteratively 14 while(left) { 15 t = left; 16 left = t->getLeft(); 17 } 18 19 return t; 20 } 21 22 //and iterator which visits a tree in in-order traversal order. 23 template 24 class InOrderIterator { 25 public: 26 //constructor 27 InOrderIterator(BinaryTree *t) { 28 this->t = leftMost(t); 29 } 30 31 32 //prefix iterator 33 InOrderIterator &operator++() { 34 advance(); //go to the next node 35 return *this; 36 } 37 38 39 //postfix iterator 40 InOrderIterator operator++(int) { 41 InOrderIterator itr(*this); 42 advance(); 43 return itr; 44 } 45 46 47 //dereference operator 48 T operator*() { 49 return t->getData(); 50 } 51 52 53 //return true if we are past the end 54 bool eot() { 55 return !t; 56 } 57 58 private: 59 BinaryTree *t; //current node 60 61 void advance() { 62 //protect against the end 63 if(!t) return; 64 65 if(!t->getRight()) { 66 //go up 67 BinaryTree *p=t; 68 do { 69 t = p; 70 p = t->getParent(); 71 72 //if this was the root, we are done 73 if(!p) { 74 t = p; 75 return; 76 } 77 } while(p->getLeft() != t); 78 79 //move up 80 t = p; 81 } else { 82 //go the left most child of the right tree 83 t = leftMost(t->getRight()); 84 } 85 } 86 }; 87 88 89 90 //and iterator which visits a tree in pre-order traversal order. 91 template 92 class PreOrderIterator { 93 public: 94 //constructor 95 PreOrderIterator(BinaryTree *t) { 96 this->t = t; 97 } 98 99 100 //prefix iterator 101 PreOrderIterator &operator++() { 102 advance(); //go to the next node 103 return *this; 104 } 105 106 107 //postfix iterator 108 PreOrderIterator operator++(int) { 109 PreOrderIterator itr(*this); 110 advance(); 111 return itr; 112 } 113 114 115 //dereference operator 116 T operator*() { 117 return t->getData(); 118 } 119 120 121 //return true if we are past the end 122 bool eot() { 123 return !t; 124 } 125 126 private: 127 BinaryTree *t; //current node 128 129 void advance() { 130 //protect against the end 131 if(!t) return; 132 133 if(t->getLeft()) { 134 t = t->getLeft(); 135 } else { 136 //go up until we approach from left and can go right 137 BinaryTree *p = t; 138 do { 139 t = p; 140 p = t->getParent(); 141 } while(p && (p->getLeft() != t || !p->getRight())); 142 143 //done! (no more tree) 144 if(!p) { 145 t = p; 146 return; 147 } 148 t = p->getRight(); 149 } 150 } 151 }; -- end treeItr.h -- And now, write out this test driver, and give it all a spin to make sure your lexer and your tree iterators are working. -- begin btreeTest.cpp -- 1 /* 2 * This is a small test driver for our btree. It builds a naive 3 * BST. Behold the glories of degenerative cases! 4 */ 5 6 #include 7 #include "btree.h" 8 #include "treeItr.h" 9 10 using namespace std; 11 12 //an implementation of the bstInsert algorithm 13 void bstInsert(BinaryTree *t, int data) { 14 if(data <= t->getData()) { 15 //insert left 16 if(!t->getLeft()) 17 t->linkLeft(new BinaryTree(data)); 18 else 19 bstInsert(t->getLeft(), data); 20 } else { 21 //insert right 22 if(!t->getRight()) 23 t->linkRight(new BinaryTree(data)); 24 else 25 bstInsert(t->getRight(), data); 26 } 27 } 28 29 30 int main(void) { 31 BinaryTree *t=NULL; 32 int data; 33 34 do { 35 //prompt the user 36 cout << "Enter Number (-1 to exit): "; 37 38 //get the data 39 cin >> data; 40 41 if(data == -1) 42 continue; 43 44 if(!t) 45 t=new BinaryTree(data); 46 else 47 bstInsert(t, data); 48 } while(data != -1); 49 50 //print the tree 51 t->printTree(); 52 53 //do an inorder traversal 54 cout << "In order traversal: "; 55 for(InOrderIterator itr(t); !itr.eot(); itr++) { 56 cout << *itr << " "; 57 } 58 cout << endl; 59 60 61 //do an preorder traversal 62 cout << "preorder traversal: "; 63 for(PreOrderIterator itr(t); !itr.eot(); itr++) { 64 cout << *itr << " "; 65 } 66 cout << endl; 67 68 return 0; 69 } -- end btreeTest.cpp -- Parsing Expressions (Challenge Lab) =================================== Ok! Enough hand holding. Now is the time for you to write a parse tree application. You will put together the pieces of the app using my lexer and the pseudocode you have from class. Your application should allow the user to enter an expression, and then it will display the answer with correct order of operations. You don't have to worry about checking the correctness of what they type. For the purpose of the lab, you can assume the user will always give you a correct instruction. Sample output follows: ./infixCalc Expression: 2.5 + 2.5 Answer: 5 ./infixCalc Expression: 2.5*1.2 - 5*2 Answer: -7 Hints: 1.) Tree traversal is the key to evaluating the expressions. That and good function decomposition. For instance, I have a function called "apply" which takes an operator and 2 numbers and applies the correct operation. You may want to look at previous examples and what you get when you do a post order traversal. 2.) Try printing your parse tree out using the print function already present in the BinaryTree template. Make sure you are creating proper trees before you evaluate them. 3.) It is possible to evaluate the trees using only 3 variables. You will need to figure out which direction you rise from a child if you want to try to do this. 4.) Just relax. You have 95% of this spelled out in your notes. You can do this! Extra Credit ============ Using the lexer and the parsing notes we have from class, it wouldn't be too hard to take the next step and do error checking. If your program can spot syntax errors (invalid sequences of lexemes) then I will give you extra credit. The better it is at expressing where the error is, the higher your extra credit points will be! Enjoy!