Re: special entities: < > & and "

Steve Heaney (Steve.Heaney@delft.sgp.slb.com)
Wed, 6 Oct 1993 19:07:35 +0100


Kevin,

The requirement that < > and other characters be replaced by entity references
(in certain situations) comes from SGML and is to do with the way that an SGML
parser processes the text of an SGML file.

There are several "data types" which an element's content can have, including:

#PCDATA - parsed character data. The parser has to scan it to determine
whether it contains any further markup.

CDATA - character data. Markup characters are not recognised as markup;
they are treated as ordinary data (apart from the end tag that closes the
element).

RCDATA - replaceable character data. As CDATA except entity references
and character references are recognised.

EMPTY - element does not have any content.
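
As a rough illustration of how the first three types differ, here is a toy
sketch in Python. It is not how a real SGML parser works: the function names,
the tiny entity table and the regular expressions are all made up for the
example, and the special handling of the closing end tag in CDATA and RCDATA
is left out.

```python
import re

# Hypothetical entity table, just large enough for the illustration.
ENTITIES = {"lt": "<", "gt": ">", "amp": "&", "quot": '"'}

# Very loose notion of "something that looks like a tag".
TAG = re.compile(r"<[^>]*>")


def expand_entities(text):
    """Replace &name; with its character when name is in ENTITIES."""
    return re.sub(
        r"&(\w+);",
        lambda m: ENTITIES.get(m.group(1), m.group(0)),
        text,
    )


def read_content(text, content_type):
    """Return (kind, value) tokens as each content type would see the text."""
    if content_type == "CDATA":
        # Nothing is recognised: the text is taken literally.
        return [("data", text)]
    if content_type == "RCDATA":
        # Tags are not recognised, but entity references are expanded.
        return [("data", expand_entities(text))]
    if content_type == "#PCDATA":
        # The text is scanned for markup: tags are split out and
        # entity references in the remaining data are expanded.
        tokens = []
        for piece in re.split(r"(<[^>]*>)", text):
            if not piece:
                continue
            if TAG.fullmatch(piece):
                tokens.append(("tag", piece))
            else:
                tokens.append(("data", expand_entities(piece)))
        return tokens
    raise ValueError("unknown content type: " + content_type)


sample = "a &lt; b <em>c</em>"
for kind in ("CDATA", "RCDATA", "#PCDATA"):
    print(kind, read_content(sample, kind))
```

Only the #PCDATA branch treats <em> as a tag, which is why a literal < in
such content has to be written as an entity reference.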

Most of the elements in the HTML DTD will be declared to have content of
type #PCDATA. NCSA Mosaic may not have a problem with "reserved" characters
such as <, >, " and & in these elements, but you can bet that an SGML
parser will choke on them.
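
If you are generating the text from a program, the replacement is easy to
automate. A minimal sketch in Python, assuming the usual entity names lt, gt,
amp and quot (the function name is just made up for the example):

```python
def escape_reserved(text):
    """Replace the four reserved characters with entity references."""
    # The ampersand must be handled first; otherwise the "&" introduced
    # by the later replacements would itself be escaped a second time.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    text = text.replace('"', "&quot;")
    return text


print(escape_reserved('if a < b && c > "d"'))
# prints: if a &lt; b &amp;&amp; c &gt; &quot;d&quot;
```

This should only be applied to the character data, not to text that already
contains the tags you want the parser to see.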

Here starteth the sermon ...

That's what comes of using a browser to validate your markup :-)

Here endeth the sermon. Amen.

Steve.

------------------------------------------------------------------------
Steven Heaney

Schlumberger Geco-Prakla
Internet: heaney@delft.sgp.slb.com
------------------------------------------------------------------------