Subj : Parsing regular languages To : comp.programming From : Omri Barel Date : Wed Aug 03 2005 11:20 pm Hi, I was wondering about parsing regular languages. I know a regular grammar (right regular grammar) has production rules of the form: A -> a A -> aB A -> epsilon However, what happens if I have something that looks like a context free grammar, but the language it describes is regular? In my case it's even easier - the grammar is not recursive. So I have something like: S -> CDE S -> CDxF C -> abcG D -> m D -> n E -> xyz F -> rG G -> q As you can see, I can simply expand the entire thing, because it's not recursive: S -> abcqmxyz S -> abcqnxyz S -> abcqmxrq S -> abcqnxrq All very finite and simple. If each of the non-terminal has an action associated with it, I can keep that action in the expansion: S -> abcq[CG]m[D]xyz[E] S -> abcq[CG]n[D]xyz[E] S -> abcq[CG]m[D]xrq[FG] S -> abcq[CG]n[D]xrq[FG] However, the grammar is quite large, and I would like to avoid this expansion. I can always use a general context-free parsing algorithm (Earley's algorithm, for example), but I *know* the grammar describes a regular language, so I'm hoping I could do better than that. Is there anything I can do in this situation? I'm looking for a linear-time algorithm which doesn't take a lot of memory, and can use the grammar directly. Note that although there's no left-recursion problem (it's not recursive), I'm trying to avoid any left-factoring, due to memory requirements and processing time. So, if my grammar is: S -> a S -> a a S -> a a a S -> a a a a S -> a a a a a I don't want to convert it first into: S -> a A A -> a B | epsilon B -> a C | epsilon C -> a D | epsilon D -> a | epsilon And one more thing: the grammar is not stable - it's actually computer generated, so anything manual is completely out of the question. Thanks! P.S. If you think this question is too specific for this newsgroup, please let me know where I can post it. (And hopefully I don't have to say that this is NOT homework... ;-) .