https://laemeur.sdf.org/words/D29.html
Title: Rich Text, Poor Text
Author: Adam Moore (LAEMEUR)
Date: February 9, 2013
Revisited: January 17, 2014
Rich Text, Poor Text
Bold, italic, subscript, superscript, underlines,
strike-throughs -- I don't find any of these presentational
attributes of text any more frivolous than quotation marks and
exclamation points. I mean, really, if the goal was to be starkly
minimalistic about it, we could write prose for electronic
transmission with letters, spaces and line-breaks, and throw out all
the explicit markup. We don't call it "markup" when it's been around
for more than fifty years, we call it punctuation instead, but it's
the same thing.
THE WRITTEN WORD HAS QUITE A FORCEFUL APPEARANCE THIS WAY
ALTHOUGH AS STATED IN MY LAST ESSAY IT IS SOMEWHAT LACKING IN NUANCE
IT LACKS ALSO FLEXIBILITY
ONE MUST STRUCTURE THEIR STATEMENTS WITH CONSIDERABLY MORE CLARITY
WHEN THEY DO NOT HAVE THE CRUTCH OF COMMAS AND PARENTHESES AND OTHER
DELIMITERS TO LEAN UPON
But nobody wants to have to live in that world. We all recognize the
expressive doors opened up by our little tool-box of commas and
asterisks and hyphens and slashes. And the utility of presentational
attributes like bold text and underlines for clarity and
expressiveness is no less appreciated. In fact, their availability
when composing text on the computer is now taken for granted -- to the
extent that I don't even have the option of composing plain-text
messages in Gmail anymore(1).
But there's a problem with the way these attributes are stored on the
computer. Back in the 1960s, when the American Standard Code for
Information Interchange was being worked out and the decisions were
being made about what to encode in the meager 7-bit code space of
the code, there wasn't room enough to store additional presentational
information about each character, so that information was necessarily
left out. Only the basics made it in: letters, numbers, punctuation,
and some control codes.
The only way to get presentational information into your text was to
start embedding information about the information within the
information. That is to say that within a stream of bytes, some of
those bytes would represent a message, and some of those bytes would
represent how to present that message. ANSI did this near the
hardware level with escape sequences in the 1970s, and innumerable
schemes have been arrived at for doing this with software throughout
the decades.
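To make the idea of "information about the information" concrete, here is a minimal Python sketch of a byte stream that mixes a message with ANSI escape sequences. It assumes only the common "Select Graphic Rendition" codes for bold (1), underline (4), and reset (0); the names `strip_sgr`, `SGR_PATTERN`, `message`, and `overhead` are invented for this illustration and do not come from the essay.

```python
import re

# ESC [ <parameters> m is the "Select Graphic Rendition" control sequence.
BOLD = "\x1b[1m"
UNDERLINE = "\x1b[4m"
RESET = "\x1b[0m"

# Matches SGR sequences only: ESC, "[", digits and semicolons, then "m".
SGR_PATTERN = re.compile(r"\x1b\[[0-9;]*m")


def strip_sgr(stream: str) -> str:
    """Return only the message characters, dropping the presentation bytes."""
    return SGR_PATTERN.sub("", stream)


# One stream, two kinds of content: the message and how to present it.
stream = f"This is {BOLD}important{RESET} and {UNDERLINE}underlined{RESET}."

message = strip_sgr(stream)

# Number of bytes in the stream that are presentation rather than message.
overhead = len(stream.encode("ascii")) - len(message.encode("ascii"))
```

Any program that wants the message alone has to know the escape grammar well enough to filter it out, which is the pollution the next paragraph objects to.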
The problem with this approach is that it pollutes text-streams with
non-text information. Ted Nelson explains the larger implications of
this in his article, Embedded Markup Considered Harmful.
My further objection to using embedded markup for these
presentational attributes is that by omitting them from the character
coding scheme, they are denied as elements of language and, to use
some Nelsonian terminology, they are treated as packaging rather than
content.
I maintain that they are just as much language content as the
exclamation mark is.
In the 1980s, when Joe Becker proposed Unicode, a "wide-body ASCII",
to encompass all of the world's alphabets and syllabaries and
ideograms, no provision was made for encoding presentational
attributes. In fact, such "fancy-text" is explicitly unsupported in
the Unicode 88 proposal. I simply don't agree with this approach.
Unicode has since strayed from its aim of "fixed one-to-one
correspondence with characters of the world's writing systems" by
supporting multiple-character combinations to add diacritical marks
to a glyph -- not at all dissimilar from the method of ANSI
escape-sequences -- yet still it has no standardized coding for
ubiquitous, pan-lingual presentational conventions such as bold text.
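The multiple-character combinations mentioned here can be seen directly from Python's standard `unicodedata` module. This is only an illustrative sketch; the variable names are invented for the example.

```python
import unicodedata

# One code point: LATIN SMALL LETTER E WITH ACUTE.
precomposed = "\u00e9"

# Two code points: "e" followed by COMBINING ACUTE ACCENT.
combined = "e\u0301"

# The two strings render alike but are different code point sequences.
lengths = (len(precomposed), len(combined))
equal_raw = precomposed == combined

# NFC normalization composes the pair into the single precomposed character.
equal_after_nfc = unicodedata.normalize("NFC", combined) == precomposed

# A non-zero combining class marks U+0301 as a mark that modifies
# the character before it rather than standing alone.
accent_class = unicodedata.combining("\u0301")
```

The accent is a code point in the stream that changes how its neighbor is drawn, which is the resemblance to escape sequences that the paragraph above points to.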
Were it my world to command, I'd simply move everything to a
32-bit coding scheme and reserve the top 8 bits for presentational
attributes. The lower 24 would remain for whatever 16.7 million
characters people can dream up to fill the space with.
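As a rough sketch of what such a 32-bit scheme could look like, the Python below packs an 8-bit attribute field above a 24-bit character code. The specific flag assignments (`BOLD`, `ITALIC`, `UNDERLINE`, `STRIKE`) and the helper names are hypothetical choices made for this example; the essay only specifies the 8/24 split.

```python
ATTR_SHIFT = 24
CODE_MASK = (1 << ATTR_SHIFT) - 1  # lower 24 bits: 16,777,216 possible codes

# Hypothetical bit assignments for the top 8 bits (not from the essay).
BOLD = 0x01
ITALIC = 0x02
UNDERLINE = 0x04
STRIKE = 0x08


def pack(ch: str, attrs: int = 0) -> int:
    """Combine one character and its attribute flags into a 32-bit unit."""
    if not 0 <= attrs <= 0xFF:
        raise ValueError("attributes must fit in 8 bits")
    # Every Unicode code point is at most 0x10FFFF, so it fits in 24 bits.
    return (attrs << ATTR_SHIFT) | ord(ch)


def unpack(unit: int) -> tuple[str, int]:
    """Split a 32-bit unit back into (character, attribute flags)."""
    return chr(unit & CODE_MASK), unit >> ATTR_SHIFT


def encode(text: str, attrs: int = 0) -> list[int]:
    """Encode a run of text that shares one set of attributes."""
    return [pack(ch, attrs) for ch in text]


def plain_text(units: list[int]) -> str:
    """Recover the characters alone by masking off the attribute bits."""
    return "".join(unpack(u)[0] for u in units)


bold_a = pack("A", BOLD)
shout = encode("hi", BOLD | ITALIC)
```

In this layout the attributes travel with each character instead of sitting beside it in the stream, so recovering the bare text is a mask rather than a parse.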
--L.
D29
Afterword
Looking back at this a year after writing it, it seems a little
...hasty? While I think conflating "markup" and "punctuation" was a
step too far, particularly in light of the fact that the former term
has accrued considerable connotative baggage in the last ~20 years, I
do think it's worth investigating the boundaries of orthography (or
graphology?) and style; is language just the marks we make, or can it
also be the way we make them?
--L.
E17
Notes
1. Someone must have complained, because this option has returned --L.
E17
(c)2013 - 2014 LAEMEUR. Most rights reserved.