Subj : Re: Converting .doc to .xml
To   : comp.programming
From : Jordi Cuenca
Date : Sat Oct 08 2005 09:25 am

A very useful address. I still do not know if it will be enough, but I am
sure that I will find interesting information there about my issue ;)

Thank you, Jongware.

Jordi.

[Jongware] wrote:
> Jordi put this forward:
>
>>I am looking for an already developed tool that can convert Word .doc
>>files (I do not mind the version) to .xml format.
>>
>>That tool should be a command line tool and I should be able to convert
>>several files from .doc to .xml and, very important, I should be able to
>>get the non-text data of the file (I mean, for example, author, last
>>printing time and so on).
>>
>>The idea I have of it is something like:
>>
>>c:\> doc2xml [list of parameters] *.doc *.xml
>>
>>I've been trying with antiword but I did not succeed because it is asking
>>me for a DTD file and I do not know where to get it from.
>
> Antiword probably expects you to create the DTD yourself. If you want to use
> XML you should also look into the DTD story.
> The website http://www.xml.com/pub/a/2003/12/31/qa.html shows several
> different approaches to Word -> XML conversion. Note that under the
> 'comments' a SourceForge tool is mentioned which can convert RTF, which is
> easily exported from Word, including metadata.
>
> [jongware]