Well, the message/rfc822 part was just a (machine-readable :) way
to stick the example message in the middle of my comments.
The multipart/alternative type says to a MIME user agent "Here are
several versions of the same info. Pick the richest one you can
display." So dumb (i.e. minimally MIME-capable) newsreaders will show
the text/plain version. Knowbots and smart newsreaders will recognize
text/x-html and parse it for URLs.
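
To make the "pick the richest one you can display" rule concrete, here
is a minimal sketch using Python's standard email package. The helper
names (build_alternative, pick_richest), the subject line, and the
example URL are made up for illustration; they are not part of any
newsreader. The sketch relies on the MIME convention that alternatives
are listed from plainest to richest.

```python
from email.message import EmailMessage


def build_alternative(plain_text, html_text):
    """Build a multipart/alternative message (hypothetical helper).

    Parts go in order from plainest to richest, so the text/plain
    version comes first and text/x-html comes last.
    """
    msg = EmailMessage()
    msg["Subject"] = "Example"  # illustrative header only
    msg.set_content(plain_text)  # becomes the text/plain part
    # Adding an alternative converts the message to multipart/alternative.
    msg.add_alternative(html_text, subtype="x-html")
    return msg


def pick_richest(msg, displayable):
    """Return the last (richest) part whose type we can display.

    `displayable` is the set of content types this user agent handles;
    returns None if no part matches.
    """
    for part in reversed(list(msg.iter_parts())):
        if part.get_content_type() in displayable:
            return part
    return None


msg = build_alternative(
    "See http://example.org/ for details.\n",
    '<P>See <A HREF="http://example.org/">this page</A> for details.\n',
)

# A minimally MIME-capable reader only knows text/plain.
dumb = pick_richest(msg, {"text/plain"})
# A smarter reader also understands text/x-html.
smart = pick_richest(msg, {"text/plain", "text/x-html"})
```

A reader that understands neither type gets None back and has to fall
back on something else, e.g. showing the raw message.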
>You have to assume the DTD is the official HTML DTD, not some local
What official HTML DTD? The IETF draft version that just expired? Last
I checked, it had syntax errors that prevented it from parsing.
> this is what the browsers assume anyway.
Some approximation of it anyway...
> The issue has
>been muddied because the HTML DTD initially distributed didn't
>work well, leading to local fixes, and new stuff from HTML+ has leaked
>into browser functionality, necessitating local updates.
>Users want to use the full display ability of the
>browsers they use, and browser developers haven't waited
>on an official revision.
Like I said... there's no handy way to validate an HTML document...
:-) The only practical way is to try it out on all the browsers you
expect your consumers to use. It's just a sad fact that HTML doesn't
have a working formal definition.
>Browsers that read arbitrary DTDs are on their way. It seems to me
>that what you are pursuing (rightly) is a well defined set of info
>that should be accommodated by any DTD that claims to be useful
>for hypertext, along with another well defined set of info that
>is to be supplied in the MIME wrapper when the SGML instance
>is served. The first part might resolve into some set of
>"architectural forms," that is, attributes with #FIXED values
>that can be used in, even retrofitted to, any DTD that actually
>has the appropriate info (such as an <AUTHOR> tag).
>Do I read you correctly?
I think so. It's all a question of quality versus functionality,
specification versus deployment, namespaces and syntaxes ...
Also... I don't see MIME and SGML as exclusive in functionality:
they're both encodings for structured information. And if you want to
look at maturity, USENET has got to be the world's most mature
MIME has got some syntactic nasties, but they're motivated by trying
to get GIF files through VMS mail gateways.... The real value of MIME
is the collective experience from the widespread deployment of
internet mail and news. If you know you're not sending your info
through such beasties, you want to use as little of the MIME syntax as
you can... unless you're designing a protocol on-the-fly like HTTP.
SGML has got even worse syntactic nasties, and we won't go into why.
The real value of SGML is the ability to say "this document conforms
to this structure..." But the SGML standard is so foggy on semantics,
it's barely useful for anything else. (Don't tell me it says "nothing"
about semantics: it has all sorts of innuendos about characters being
processed "before" other characters, invoking programs to "interpret"
notations, things being "ignored", "significant", etc.)
The real problem with SGML is the namespace issue: there's no
subclassing mechanism. Look at the loops HyTime goes through to map
some pretty straightforward notions onto SGML.
The important step is to define the semantics we're interested in, and
then find a representation in some widely supported formats (MIME and
SGML are handy...). For example: how do we identify documents such
that we can test them for equality? (Why? For caching purposes.)
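
One possible answer to the equality question, purely as a sketch: treat
two documents as "the same" when their bytes are identical, and use a
digest of those bytes as the identifier. That is an assumption, not a
settled definition; it ignores differences in encoding, line endings,
or markup that a better notion of equality might want to smooth over.
The names document_id and DigestCache are hypothetical.

```python
import hashlib


def document_id(body: bytes) -> str:
    """Identify a document by a digest of its exact bytes.

    Assumption: byte-for-byte identity is the equality we care about.
    """
    return hashlib.sha256(body).hexdigest()


class DigestCache:
    """Tiny in-memory cache keyed by document digest (illustrative)."""

    def __init__(self):
        self._store = {}

    def put(self, body: bytes) -> str:
        # Identical documents map to the same key, so they are stored once.
        key = document_id(body)
        self._store.setdefault(key, body)
        return key

    def get(self, key: str):
        # Returns None when the document is not cached.
        return self._store.get(key)

    def __len__(self):
        return len(self._store)


cache = DigestCache()
first = cache.put(b"<TITLE>Example</TITLE>\n")
second = cache.put(b"<TITLE>Example</TITLE>\n")  # same bytes, same key
other = cache.put(b"<TITLE>Another</TITLE>\n")
```

Here the first two puts yield the same identifier and the cache holds
two entries, not three.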
I'm starting to ramble... More later.