Newsgroups: comp.text
Path: utzoo!utgpu!watserv1!watdragon!watsol.waterloo.edu!tbray
From: tbray@watsol.waterloo.edu (Tim Bray)
Subject: Re: Public Domain Dictionary
Message-ID: <1991Jun23.221908.20234@watdragon.waterloo.edu>
Sender: news@watdragon.waterloo.edu (News Owner)
Organization: University of Waterloo
References: <91156.144339GONTER@awiwuw11.wu-wien.ac.at> <811@tivoli.UUCP> <ENAG.91Jun13023731@gyda.ifi.uio.no>
Date: Sun, 23 Jun 1991 22:19:08 GMT
Lines: 26

enag@ifi.uio.no (Erik Naggum) writes:
 They have used a portion of SGML's syntax, which, I'm sorry to say,
 does not make it SGML-conformant.  

It's not clear to me that an SGML-conformant dictionary is either necessary
or desirable.  A dictionary should be a small model of a human language.  
Not even SGML's strongest partisans claim for it an ability to model natural
language.

 As I heard the story, it was too
 many inconsistencies in the original material to try to make a DTD for
 OED2....
 What I've seen of the OED2 is not pretty, so they obviously had a very
 hard job figuring out how to encode it and actually do the encoding.

Indeed.  Frank Tompa of the New OED project at Waterloo, who has had a lot
of experience with online dictionaries, worked with Bob Amsler (then of
Bellcore, now of Mitre) and put in a lot of time to come up with a proposed
SGML definition for dictionaries.  But it was tough, and even Tompa and
Amsler were left somewhat unsatisfied that they had covered all the bases.

I have to disagree on the "not pretty" part.  Challenging, complex, and
somewhat irregular, yes, all of those are true.  But the OED2 material is
no uglier than the English language that the OED is trying to describe.

Tim Bray, U of W Centre for the New OED -and- Open Text Systems
