Subj : Re: File Syntax checking: need logical help
To   : comp.programming
From : Rob Thorpe
Date : Mon Aug 01 2005 12:45 pm

Abbacabba wrote:
> I'm trying to create an efficient and manageable way to define a file
> format and check it during processing. Our file formats are both fixed
> record length, and variable record/delimited.
>
> I would like to be able to define a format, and have the program
> generate the applicable rules.
>
> Given a generic data structure definition: (example only)
>
> A - mandatory - A structure ends with H record
> B - mandatory - B structure ends with G Record
> C - mandatory - C Structure ends with D Record
> C2 - Optional
> C3 - Optional
> D - Optional
> E - Mandatory
> E1 - conditional - (E "packet" must have an E1, E1/E2, E2 or E3)
> E2 - conditional - (E "packet" must have an E1, E1/E2, E2 or E3)
> E3 - conditional - (E "packet" must have an E1, E1/E2, E2 or E3)
> F - mandatory
> G - Mandatory
> H - Mandatory
>
> What type of logic structure would be the best to use?
>
> I've looked at State machines and trees, but don't feel either (or my
> implementation of either) comes across as flexible or efficient.
>
>
> Example of logic:
>
> If the program has just finished with an "A" record the only possible
> correct record type to process would be a "B".
>
> If the program has just finished with a "C1" record, possible records
> would be both a "C" or a "D" record.
>
>
> I'm just needing some general ideas for a direction to pursue, not an
> actual solution (although that would be most helpful).

If you're creating a file format that general, it's probably worth
parsing it formally, as though it were a computer language.

That is, write a formal grammar for the language. Then either:
- enter it into bison or yacc
- build a recursive-descent parser to parse it (see compiler textbooks
  for ways to do this; it's quite simple)

It's probably worth splitting it into two tasks:
1. Understanding the file format and entering it into data structures
   to use (parsing)
2. Checking if a valid pattern of data structures is present at the
   end (semantic checking)
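
As a rough illustration of that two-task split, here is a minimal recursive-descent sketch in Python. The grammar in the comments is only my guess at what the example definition means (one or more B..G batches inside A..H, one or more C groups per batch, D treated as optional); it is not taken from any real spec. The names Parser, Group, FormatError, check and validate are all made up for the example, and the input is assumed to be a list of record-type strings already extracted from the file.

```python
from dataclasses import dataclass, field


class FormatError(Exception):
    """Raised when the record sequence does not match the assumed grammar."""


@dataclass
class Group:
    optional: list = field(default_factory=list)   # which of C2, C3, D appeared
    e_records: list = field(default_factory=list)  # E1/E2/E3 as seen, not yet checked


class Parser:
    """Task 1: turn a flat list of record types into nested data structures."""

    def __init__(self, records):
        self.records = list(records)
        self.pos = 0

    def peek(self):
        return self.records[self.pos] if self.pos < len(self.records) else None

    def expect(self, kind):
        if self.peek() != kind:
            raise FormatError(f"record {self.pos}: expected {kind}, got {self.peek()}")
        self.pos += 1

    def accept(self, kind):
        if self.peek() == kind:
            self.pos += 1
            return True
        return False

    def parse_file(self):
        # Assumed rule: file := A batch+ H
        self.expect("A")
        batches = [self.parse_batch()]
        while self.peek() == "B":
            batches.append(self.parse_batch())
        self.expect("H")
        if self.peek() is not None:
            raise FormatError(f"record {self.pos}: trailing data after H")
        return batches

    def parse_batch(self):
        # Assumed rule: batch := B group+ G
        self.expect("B")
        groups = [self.parse_group()]
        while self.peek() == "C":
            groups.append(self.parse_group())
        self.expect("G")
        return groups

    def parse_group(self):
        # Assumed rule: group := C [C2] [C3] [D] E {E1 | E2 | E3} F
        group = Group()
        self.expect("C")
        for kind in ("C2", "C3", "D"):
            if self.accept(kind):
                group.optional.append(kind)
        self.expect("E")
        while self.peek() in ("E1", "E2", "E3"):
            group.e_records.append(self.peek())
            self.pos += 1
        self.expect("F")
        return group


# Task 2: semantic check. An E packet must be E1, E1/E2, E2 or E3.
VALID_E_PACKETS = {("E1",), ("E1", "E2"), ("E2",), ("E3",)}


def check(batches):
    for b, groups in enumerate(batches):
        for g, group in enumerate(groups):
            if tuple(group.e_records) not in VALID_E_PACKETS:
                raise FormatError(
                    f"batch {b}, group {g}: invalid E packet {group.e_records}")


def validate(records):
    """Parse, then check; raises FormatError on the first problem found."""
    check(Parser(records).parse_file())
```

Calling validate("A B C C2 E E1 E2 F G H".split()) returns without error, while a sequence with a bare E and no E1/E2/E3 parses fine but is rejected by check(). Keeping the E-packet rule out of the parser is a design choice: the grammar stays small, and rules like "must have one of these combinations" can be changed without touching the parsing code.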