<BASE> processing by browsers

Grzegorz Staniak (GSTANIAK@golem.umcs.lublin.pl)
Sat, 29 Apr 1995 15:28:07 +0500


Just a couple of thoughts concerning the <BASE> tag.

I use the tag arbitrarily, i.e. I do not always make it contain the
actual URL of the document that contains the tag. Sometimes the tag
points to a directory a level or two above the document, which allows
me to refer easily to other documents in neighbouring directories,
like this:

(for a URL like "http://my.www.server/foo/bar/my_document.html")

<BASE HREF="http://my.www.server/foo">
...
<A HREF="/bara/another_document.html">
<A HREF="/barb/yet_another.html">
<A HREF="/barc/and_one_more.html">

This way, the URL of the retrieved document appears nowhere in its
content; however, the links are put in a context, and such a file is
still fully functional after being saved and opened locally - every
link works.

The problem is that a number of browsers think they know better what
the value of my <BASE> tag should be, i.e. they assume that it must
contain the URL of my document. Not surprisingly, when you try to
follow any of the links from the example above, you get 404 "Not
Found" error messages, giving URLs like:

"http://my.www.server/foo/bar//bara/another_document.html",

which, I agree, do not exist.
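
I can only guess at what these browsers do internally. One
hypothetical behaviour that reproduces the URL above is to ignore
<BASE>, strip the file name from the document's own URL, and glue the
link onto what is left. A minimal sketch in Python (the function name
naive_resolve is invented for illustration and is not taken from any
browser):

```python
def naive_resolve(document_url, href):
    """Hypothetical: ignore <BASE> and append href to the document's directory."""
    # Drop the last path segment ("my_document.html") from the document URL.
    directory = document_url.rsplit("/", 1)[0]
    # Blindly concatenate, without checking whether href starts with "/".
    return directory + "/" + href


broken = naive_resolve(
    "http://my.www.server/foo/bar/my_document.html",
    "/bara/another_document.html",
)
print(broken)
```

If a browser did something along these lines, the doubled slash in
the failing URL would be explained by the link's own leading slash.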

My point is that this is a mistake on the part of browser developers.
There's nothing in the HTML 3.0 specs (or HTML 2.0 for that matter)
that would prevent the author from arbitrary use of the tag. If HTML
3.0 mentions that the default BASE is the URL of the document itself,
then talking of defaults only makes sense if you're allowed to
override them, doesn't it?

The internet draft on Relative Uniform Resource Locators,
<draft-ietf-uri-relative-url-06.txt>, proposes another way of doing
what I do: using "." and ".." in the relative path, like:

<BASE HREF="http://my.www.server/foo/bar/my_document.html">
...
<A HREF="../bara/another_document.html">

but in section 3, "Establishing a base URL", it stresses that a base
URL embedded in the document's content should take priority, during
parsing, over any other way of establishing it.
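
As a rough illustration of that priority rule, here is a minimal
sketch that uses Python's standard urllib.parse.urljoin as a stand-in
for a browser's resolver (an assumption; I am not claiming any browser
works this way). The helper name resolve_link and its parameters are
made up for the example. It prefers the embedded base over the
retrieval URL, then resolves three kinds of link against it.

```python
from urllib.parse import urljoin


def resolve_link(href, retrieval_url, embedded_base=None):
    """Resolve href, letting an embedded <BASE> override the retrieval URL."""
    base = embedded_base if embedded_base is not None else retrieval_url
    return urljoin(base, href)


DOC = "http://my.www.server/foo/bar/my_document.html"

# The draft's style: BASE is the document itself, the link climbs with "..".
dotdot = resolve_link("../bara/another_document.html", DOC, embedded_base=DOC)

# An arbitrary BASE naming a directory (trailing slash), link without a
# leading slash.
arbitrary = resolve_link(
    "bara/another_document.html", DOC,
    embedded_base="http://my.www.server/foo/",
)

# A link with a leading slash is an absolute path and replaces the base path.
rooted = resolve_link(
    "/bara/another_document.html", DOC,
    embedded_base="http://my.www.server/foo",
)

print(dotdot)
print(arbitrary)
print(rooted)
```

Under these rules the first two calls give
"http://my.www.server/foo/bara/another_document.html", while the link
with a leading slash resolves from the server root to
"http://my.www.server/bara/another_document.html". So a directory-style
<BASE> seems to need a trailing slash, and the links no leading slash,
to land under /foo/.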

I have a feeling that at one time multiple <BASE> tags were proposed,
to serve more or less the same function as my arbitrary <BASE>. It
seems this has since been dropped.

Perhaps the issue is not that important, but more widespread use of
<BASE> together with relative URLs would make saved files more useful.
Very often I save an interesting page only to see, after opening it
locally, that it's full of relative URLs but has no <BASE> tag, and
all the links are useless.

-------------------------------
Grzesiek Staniak
<gstaniak@golem.umcs.lublin.pl>
<gstaniak@galen.imw.lublin.pl>