Subj : Re: Converting .doc to .xml
To   : comp.programming
From : [Jongware]
Date : Thu Oct 06 2005 05:13 pm

Jordi put this forward:

> I am looking for an already developed tool that can convert Word .doc
> files (I do not mind the version) to .xml format.
>
> That tool should be a command line tool and I should be able to convert
> several files from .doc to .xml and, very important, I should be able to
> get the non-text data of the file (I mean, for example, author, last
> printing time and so on).
>
> The idea I have of it is something like:
>
> c:\> doc2xml [list of parameters] *.doc *.xml
>
> I've been trying with antiword but I did not succeed because it is asking
> me for a DTD file that I do not know from where to get it.

Antiword probably expects you to supply the DTD yourself. If you want to
work with XML, you should read up on DTDs anyway.

The article at http://www.xml.com/pub/a/2003/12/31/qa.html shows several
different approaches to Word -> XML conversion. Note that the comments
there mention a SourceForge tool that can convert RTF, which is easily
exported from Word and includes the metadata.

[jongware]
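
To give an idea of why the RTF route is attractive for the metadata part: RTF keeps author, title, last printing time and so on in a plain-text \info group, so it can be pulled out with very little code. What follows is only a rough sketch in Python and not the SourceForge tool mentioned above. The function names and the XML element names (metadata, created, revised, printed) are made up for illustration. It assumes the file really contains an \info group and it does not decode RTF escapes such as \'e9 in the field values.

```python
import re
import sys
import xml.etree.ElementTree as ET

# Text-valued control words of the RTF \info group.
TEXT_FIELDS = ("title", "subject", "author", "operator", "keywords")
# Timestamp control words, mapped to illustrative (made-up) XML tag names.
TIME_FIELDS = {"creatim": "created", "revtim": "revised", "printim": "printed"}


def extract_group(rtf, start):
    """Return the brace-balanced group that begins at index `start`."""
    depth = 0
    i = start
    while i < len(rtf):
        ch = rtf[i]
        if ch == "\\":
            i += 2  # skip the backslash and the next character (covers \{ and \})
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return rtf[start:i + 1]
        i += 1
    return rtf[start:]  # unbalanced input: return what is left


def rtf_info_to_xml(rtf):
    """Build a <metadata> element from the \\info group of an RTF string."""
    root = ET.Element("metadata")
    pos = rtf.find("{\\info")
    if pos < 0:
        return root  # no \info group: empty <metadata/>
    info = extract_group(rtf, pos)
    for name in TEXT_FIELDS:
        m = re.search(r"\{\\" + name + r"\s?([^{}]*)\}", info)
        if m:
            ET.SubElement(root, name).text = m.group(1).strip()
    for word, tag in TIME_FIELDS.items():
        # e.g. {\printim\yr2005\mo10\dy6\hr17\min13}
        m = re.search(r"\{\\" + word + r"((?:\\[a-z]+\d+\s?)+)\}", info)
        if m:
            t = {k: int(v) for k, v in re.findall(r"\\([a-z]+)(\d+)", m.group(1))}
            ET.SubElement(root, tag).text = "%04d-%02d-%02dT%02d:%02d" % (
                t.get("yr", 0), t.get("mo", 0), t.get("dy", 0),
                t.get("hr", 0), t.get("min", 0))
    return root


if __name__ == "__main__":
    # Usage: python rtfinfo.py file1.rtf file2.rtf ...
    for path in sys.argv[1:]:
        with open(path, encoding="ascii", errors="replace") as f:
            root = rtf_info_to_xml(f.read())
        root.set("source", path)
        print(ET.tostring(root, encoding="unicode"))
```

Each file named on the command line gives one <metadata> element on standard output. The Windows command prompt does not expand *.rtf by itself, so the file names have to be listed explicitly or expanded by the calling script.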