Newsgroups: comp.text.sgml
Path: utzoo!utgpu!watserv1!maytag!csg.waterloo.edu!garyp
From: garyp@csg.waterloo.edu (Gary Pianosi)
Subject: Record boundaries in SGML
Message-ID: <1990Nov21.210152.2631@maytag.waterloo.edu>
Sender: daemon@maytag.waterloo.edu (Admin)
Organization: University of Waterloo
Date: Wed, 21 Nov 90 21:01:52 GMT
Lines: 74


I am having some problems importing and exporting SGML documents using
SoftQuad Author/Editor v1.1.  For various reasons, I need to transfer
SGML documents between a Mac (where they are edited using Author/Editor),
and a PC (where they are manually edited to contain "rigorous markup").
As a result, I need the marked-up text to be both readable and correct.

I am able to import hand-edited files into Author/Editor without error,
but when I try to validate the document, I get the error message:

        "Validation Error:  Text not allowed here"

wherever there is a new line between an end tag and a start tag.
(I can delete these 'characters' to get rid of the error message, but
there are too many of them.)  Strangely, no error is reported for new
lines between adjacent start tags or end tags, even though text is not
valid between the tags.  I should mention that the hand-edited files
can be validated using MARK-IT, and that the canonical output can be
read into Author/Editor without any problems.
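
Deleting the offending new lines by hand is impractical, but the step
could be automated. The following is only a minimal sketch, not a
tested fix for Author/Editor: it assumes that a line break sitting
directly between an end tag and a start tag is never wanted as data,
and that tag names consist of letters, digits, periods and hyphens.
The function name and the pattern are my own illustration.

```python
import re

# An end tag, one line break, then a start tag. Line breaks between two
# start tags or two end tags are left alone, since they cause no error.
_END_THEN_START = re.compile(
    r"(</[A-Za-z][A-Za-z0-9.-]*\s*>)\r?\n(?=<[A-Za-z])"
)


def drop_breaks_between_end_and_start(text: str) -> str:
    """Remove line breaks found directly between an end tag and a start tag."""
    return _END_THEN_START.sub(r"\1", text)


if __name__ == "__main__":
    sample = (
        "<TIG>\n<ATL>\nA Trivial Article\n</ATL>\n</TIG>\n"
        "<AU><FNM>G.M.</FNM><SNM>Pianosi</SNM></AU>\n<AU>"
    )
    print(drop_breaks_between_end_and_start(sample))
```

Whether dropping these line breaks is actually harmless depends on the
answer to the first question below.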

On the other side, when I try to export a document from Author/Editor,
the resulting text file has new lines only within text and after some
start tags.  Besides being difficult to read, the file contains lines
longer than 500 characters, which makes it next to impossible to edit
on the PC.
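
One possible workaround for the long lines is to break them inside the
tags, just before the closing ">", so that no new line is added
between tags or within text. This is a rough sketch under two
assumptions that I have not verified against Author/Editor or MARK-IT:
that white space is permitted before the closing delimiter of a tag,
and that no attribute value contains "<" or ">". The function name and
the default width are hypothetical.

```python
import re

# A start or end tag. Assumes attribute values never contain "<" or ">".
_TAG = re.compile(r"</?[A-Za-z][^<>]*>")


def _advance(col: int, chunk: str) -> int:
    """Return the column reached after emitting chunk, starting at col."""
    if "\n" in chunk:
        return len(chunk) - chunk.rfind("\n") - 1
    return col + len(chunk)


def break_inside_tags(text: str, width: int = 72) -> str:
    """Insert a line break before the '>' of any tag that would pass width."""
    out = []
    col = 0  # characters emitted since the last line break
    pos = 0
    for match in _TAG.finditer(text):
        data = text[pos:match.start()]
        out.append(data)
        col = _advance(col, data)
        tag = match.group()
        if _advance(col, tag) > width:
            out.append(tag[:-1] + "\n>")  # the break goes inside the tag
            col = 1                       # only ">" is on the new line
        else:
            out.append(tag)
            col = _advance(col, tag)
        pos = match.end()
    out.append(text[pos:])
    return "".join(out)


if __name__ == "__main__":
    print(break_inside_tags("<P>" + "x" * 10 + "</P><P>yy</P>", width=12))
```

Lines are only broken at tags, so a long run of text with no tags in it
still comes out as one long line. The result is also not pretty, which
is why a proper answer to the questions below would be preferable.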

Finally, my questions ...

    -   What is the significance of record boundaries in SGML?

    -   Is Author/Editor's treatment of new lines between end
        and start tags incorrect?  Or is the SGML standard open
        to interpretation?

    -   Is it possible to get Author/Editor to export a readable SGML
        document?  Or, do I have to run it through some other SGML parser
        to make it readable?

Below I have included a short SGML document that I believe is correctly
and rigorously marked up in accordance with the AAP Article DTD. It
produces the aforementioned error when imported into Author/Editor. Can
someone tell me whether this is a valid AAP Article document?

Any comments would be appreciated. Thanks.
---------------------------------------------------------------------
<ARTICLE>
<FM>
<TIG>
<ATL>
A Trivial Article
</ATL>
</TIG>
<AU><FNM>G.M.</FNM><SNM>Pianosi</SNM></AU>
<AU><FNM>P.D.</FNM><SNM>Jones</SNM>
<AFF>
<ONM>University of Waterloo</ONM>
<ODV>Computer Systems Group</ODV>
<PC>N2L 3G1</PC>
<CTY>Waterloo</CTY>
<CNY>Canada</CNY>
</AFF>
</AU>
</FM>
<BDY>
<SEC>
<ST>Introduction</ST>
<P>This paper...</P>
</SEC>
</BDY>
</ARTICLE>
---------------------------------------------------------------------
--
Internet:  garyp@csg.UWaterloo.CA            Bitnet:  garyp@watcsg.BITNET
Computer Systems Group, University of Waterloo, Waterloo, Ontario, Canada
