Re: Processing instructions for style tweaks?

Philippe-Andre Prindeville (philipp@res.enst.fr)
Wed, 30 Nov 94 14:16:26 +0100


On Nov 30, 1:39, Wilfredo Sanchez wrote:
> Subject: Re: Processing instructions for style tweaks?

> Part of the whole motivation behind a content-based, rather than a
> formatting-based markup language is to leave as much of this
> responsibility to the browser, and the reader's customizations to the
> browser. Part of the problem with the push that Netscape is making is
> that it deteriorates this purpose. Tags like <blink> are completely
> formatting and say nothing about the meaning of the content.
>
> So we don't want things like '12pt'. It's too restrictive.

I agree completely.

>
> I'm OK with the idea of giving the browser hints as to how to show
> things, so this might be a little smarter:
>
> <em size="+1">
>
> So the browser knows that for some unknown reason you want the font to
> become a bit bigger, *relative to the current size*. Maybe that means
> increasing by 1 point for +1, 2 points for +2, etc, or maybe that
> means something else. Exactly what the number means may depend, sort
> of like <Hn> implies different sizes on different browsers.

Yes. Almost. Size changes actually shouldn't be linear. For instance,
between 5 and 9 points, a 1 point size change is discernible. But
beyond 12 points, you would need a 2 point change to be really obvious.
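
As a rough sketch of the non-linear idea: the function names, the 5 point
floor, and the 1 point step assumed between 9 and 12 points are illustrative
assumptions, not part of any proposal made here.

```python
def step_for(size_pt):
    """Smallest size change (in points) assumed to be noticeable at size_pt."""
    # Stated above: 1pt is discernible between 5 and 9pt, and about 2pt is
    # needed beyond 12pt. The 9-12pt range is not specified; 1pt is assumed.
    return 1.0 if size_pt <= 12 else 2.0


def relative_size(base_pt, steps, min_pt=5.0):
    """Apply a relative hint such as "+1" or "-2" to the reader's base size."""
    size = float(base_pt)
    direction = 1 if steps > 0 else -1
    for _ in range(abs(steps)):
        # Each step uses the increment appropriate to the current size,
        # so the overall mapping is not linear in the number of steps.
        size = max(min_pt, size + direction * step_for(size))
    return size


if __name__ == "__main__":
    for base in (9, 12, 34):
        print(base, "->", relative_size(base, 1))
```

With something like this, "+1" still means "bigger" whether the reader's
default is 9 points or 34 points.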

I thought about being able to do vertical motions and size changes,
for subscripts, etc, and decided this was a bad idea. A subscript
should be a logical command, since often the subscript is *bound*
to something (you can thus think of them as a sort of binary operator).

> This is still not optimal. Better, because if I set my font size to
> 34pt by default (I'm viewing on my HDTV set and sitting far away),
> then your attempt to make things bigger by "increasing" the point size
> to 18 will lose, and "+1" will always mean bigger. But the problem remains
> that the browser does not know *why* you want it bigger.

Right. Or if I cut and paste Dan's example list into a footnote
in a report I'm writing, it will be grotesquely big compared to the
9 point text I used in the footnote.

> So let's be smarter. We have some smart tags like <Hn> to mean "this
> is a header, you probably should make it big, but maybe you have a
> better idea..."

(stuff deleted)

> And we can keep on going...
>
> Hmmm...
>
> Well, there's an awful lot of different things to add. Captions,
> chapters, footnotes, appendixes, poetry, ... They're all kind of
> different and a browser may want to render each such thing in its own
> way. We run into a scalability problem:
>
> 1) The damn spec is getting huge

(This is not surprising, considering the scope and ambition of
what we are trying to do)

> 2) A really good browser is REALLY hard to write

(If it were trivial, it wouldn't be worth doing ;-)

> 3) The poor document writer has to remember all these crazy
> tags

Not necessarily. Tags are typically used in combinations. His
editor needs to be able to use these combinations then as "macros".
But what is the writer doing manipulating tags directly? His
editor should abstract all of this for him...

> So there's a good question... Do we expect the number of different
> styles to be finite and not too much to manage? If not, what's the next
> smartest thing?

Finite, but quite large, I would say.

> Style sheets might be... I don't know.
>
> The problem is that Mosaic and other browsers are already out. The
> world is writing HTML in a broken way. So it's impossible to write a
> smart browser, because most documents will only confuse it, as the

Hang on, you're missing something here. Browsers aren't the only
consumers of HTML. Other applications, either intelligent agents
(like "knowbots"), or software that integrates text recovered from
the web into other forms of information, will need to eat up HTML.
And software is much more demanding that the text be well-formed,
since it doesn't have the intelligence or "fuzziness" to accept
things that look "mostly ok".
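
To make the contrast concrete, here is a minimal sketch of a strict consumer
that refuses anything not cleanly nested. It uses Python's standard
html.parser module. The class and function names, the VOID set, and the rule
that every other element needs an explicit end tag are assumptions made for
illustration (real HTML lets some end tags be omitted).

```python
from html.parser import HTMLParser

# Assumed set of elements that never take an end tag.
VOID = {"br", "hr", "img"}


class StrictChecker(HTMLParser):
    """Records every nesting error instead of guessing what was meant."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open elements, innermost last
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")


def check(html):
    """Return a list of problems; an empty list means cleanly nested."""
    checker = StrictChecker()
    checker.feed(html)
    checker.close()
    return checker.errors + [f"unclosed <{t}>" for t in checker.stack]


if __name__ == "__main__":
    print(check("<h1>Title</h1><p>text</p>"))
    print(check("<b><i>x</b></i>"))
```

A browser would happily display both inputs; a program built on a check like
this one has to reject the second, or grow heuristics to cope with it.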

(I converted an idiot tape of a well-known French dictionary to
SGML about a year ago. Most of the text was entered by hand and
not verified. The number of times the converter coredumped
because it saw something not quite expected.... was scary. I had
to build in heuristics to handle malformed constructs, and this
took a lot of time... more time than simply writing the converter
without these "kludges").

An example might be a dictionary or a thesaurus that is available
via the Web. Now if someone wanted to build in a front-end to his
word processor to be able to consult this dictionary, the dictionary
would have to be rigidly (well-) structured or else the front-end
would have to be very hairy...

> example with headers - you can't make a table of contents based on
> headers, because not everyone uses headers as headers. People do this
> instead:
> <h3>
> <ul>
> <li> ...
> </ul>
> </h3>
> That could be a gross item in your table of contents, I think.

Right. Now imagine a piece of software that generates a base of
inverted search keys from headers in text, attaching importance
to the words in the header according to the level of nesting (i.e.
more importance for <h1> than for <h2>, etc)...
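
A minimal sketch of such an indexer, using Python's standard html.parser
module. The weighting of 1/level, the word-splitting rule, and all names
here are assumptions chosen for illustration; nothing above prescribes them.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class HeaderWords(HTMLParser):
    """Collects words found inside <h1>..<h6>, weighted by heading level."""

    def __init__(self):
        super().__init__()
        self.level = None                  # open heading level, or None
        self.weights = defaultdict(float)  # word -> accumulated weight

    def handle_starttag(self, tag, attrs):
        match = re.fullmatch(r"h([1-6])", tag)
        if match:
            self.level = int(match.group(1))

    def handle_endtag(self, tag):
        if re.fullmatch(r"h[1-6]", tag):
            self.level = None

    def handle_data(self, data):
        if self.level is None:
            return  # text outside headers is ignored
        weight = 1.0 / self.level  # assumed: <h1> counts 1, <h2> counts 1/2, ...
        for word in re.findall(r"[a-z0-9]+", data.lower()):
            self.weights[word] += weight


def build_index(docs):
    """Map each header word to {document name: weight} (an inverted index)."""
    inverted = defaultdict(dict)
    for name, html in docs.items():
        parser = HeaderWords()
        parser.feed(html)
        parser.close()
        for word, weight in parser.weights.items():
            inverted[word][name] = weight
    return dict(inverted)


if __name__ == "__main__":
    docs = {
        "a.html": "<h1>Style Sheets</h1><p>style body</p><h2>Style hints</h2>",
        "b.html": "<h3>Hints</h3>",
    }
    for word, postings in sorted(build_index(docs).items()):
        print(word, postings)
```

Feed it the <h3><ul>...</ul></h3> construct quoted above and every word of
the list gets indexed as if it were a third-level header, which is exactly
the kind of damage that misused headers do to software like this.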

> It's somewhat like CD's, only worse. The CD data format has all these
> great features. You can label every track with a song title, author,
> and artist, you can index within a track, etc. But the CD players
> didn't use it, so nobody put the data onto the CD's, so no CD players
> will ever use it. So browsers format bad HTML, everyone writes bad
> HTML, and nobody will write a browser that doesn't format bad HTML.
>
> What to do?

Good question. Wish we all knew. But first off, stop thinking that
browsers (or even humans) are the sole consumers of Web data. They're
not. And they won't be.

> Wow that was long.
> Sorry.

Don't be. You raised some good points.

-Philip