Re: Cache woes.

Daniel W. Connolly (connolly@hal.com)
Fri, 20 May 1994 10:11:18 -0500


In message <4385.9405201434@daniell.brunel.ac.uk>, Paul.Wain@brunel.ac.uk writes:
>
>Hrm, okay, so in the future it should be okay. The cache in question was
>a CERN one, no idea what version, that was translating '=' to %3D which
>as Dan said is erm "NOT safe". That was the only thing I could see
>causing problems (BTW from what I am told - not had a chance to check -
>the BNF says that = should be escaped. Is that right?)

In some circumstances. Let me explain:

There are two purposes for the %XX construct:

(1) distinguish data from markup, e.g.
distinguish '/' as a pathname-constituent character
(as it might be on a Mac) from '/' as a pathname-separator
character (as it is in POSIX and the URI syntax).

(2) allow transmission of URIs through transports that
are only reliable for a subset of the 256 octets.
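
Purpose (1) can be sketched in Python as follows. This is only an
illustration: the helper names and the sample segment "notes 5/20" are
made up, and urllib.parse stands in for whatever URI library is at hand.
The point is that a '/' belonging to a name is escaped before the
segments are joined, and the path is split on the separator '/' before
anything is decoded.

```python
from urllib.parse import quote, unquote

def join_segments(segments):
    # Hypothetical helper: escape each segment completely (safe="")
    # so that a '/' inside a name becomes %2F, then join with the
    # separator '/'.
    return "/".join(quote(s, safe="") for s in segments)

def split_segments(path):
    # Hypothetical helper: split on the separator '/' first, and only
    # then decode each segment, so %2F never turns into a separator.
    return [unquote(s) for s in path.split("/")]

encoded = join_segments(["dir", "notes 5/20", "file"])
decoded = split_segments(encoded)
print(encoded)
print(decoded)
```

Decoding the whole path before splitting it would lose the distinction,
since the data '/' and the separator '/' would become the same character.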

I believe (2) actually originated as a hack to represent
spaces in HREF attribute values, a la:
HREF=ftp://machost/dir/file%20with%20spaces
This is clearly bogus, since unquoted SGML attribute
values have a much more limited syntax, and the simple
way to represent the above is:
HREF="ftp://machost/dir/file with spaces"
What about URLs with " in them? SGML syntax includes character references:
HREF="ftp://machost/dir/file with &#34; in it"
So it is actually possible to represent an arbitrary
sequence of characters in an SGML attribute value.
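
As a rough check of the character-reference form, Python's html module
can decode it. The assumption here is that HTML-style handling of &#34;
is a fair stand-in for SGML numeric character references in attribute
values; it is not a full SGML parser.

```python
import html

# The attribute value as it would appear between the quotes in the markup.
attr_value = "ftp://machost/dir/file with &#34; in it"

# Decoding the character reference recovers the literal double quote.
decoded = html.unescape(attr_value)
print(decoded)

# Going the other way, html.escape emits &quot; rather than &#34;,
# but either form decodes to the same character.
reencoded = html.escape(decoded, quote=True)
print(reencoded)
```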

But the (2) issue is still motivated by mail transport...

So ~ and %7E mean exactly the same thing. As long as the transport
is one in which ~ characters make it through OK, there's nothing
wrong with writing http://www.hal.com/~connolly/index.html, except
for the fact that some stupid implementation might copy that
into a mail message without changing the ~ to a %7E, and then
an ASCII/EBCDIC translation would munge the ~ char.
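
A small Python sketch of that equivalence. The function name
escape_octets and the choice of "~" as the only fragile character are
assumptions for illustration; a real gateway would have its own list.

```python
from urllib.parse import unquote

def escape_octets(uri, unsafe="~"):
    # Hypothetical helper: replace each character the transport is
    # known to mangle with its %XX form. It assumes the input holds
    # only ASCII and does not touch existing % signs.
    return "".join("%%%02X" % ord(c) if c in unsafe else c for c in uri)

plain = "http://www.hal.com/~connolly/index.html"
escaped = escape_octets(plain)
print(escaped)

# Decoding the escaped form gives back the original, which is what
# "mean exactly the same thing" amounts to for a character like ~.
print(unquote(escaped) == plain)
```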

On the other hand, / and %2F do NOT mean the same thing. Nor
do = and %3D. %3D means "= as a data character", whereas plain =
is markup, for example the separator in a parameter:
ftp://host/dir/file;type=image
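
To make the difference concrete, here is a hedged Python sketch of a
parser for the last path segment. The function split_param is invented
for this example and follows no particular specification; it only shows
that a plain = separates a parameter name from its value, while %3D
stays inside the data.

```python
from urllib.parse import unquote

def split_param(segment):
    # Hypothetical helper: split on the markup characters ';' and '='
    # first, then decode the pieces. Returns (name, (key, value)),
    # with None where a part is absent.
    name, semi, param = segment.partition(";")
    if not semi:
        return unquote(name), None
    key, eq, value = param.partition("=")
    if not eq:
        return unquote(name), (unquote(param), None)
    return unquote(name), (unquote(key), unquote(value))

as_markup = split_param("file;type=image")
as_data = split_param("file;type%3Dimage")
print(as_markup)
print(as_data)
```

A cache that rewrites = to %3D therefore changes what the URI says: the
first form carries a parameter named "type", the second a single
parameter whose text happens to contain an equals sign.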

I hope that makes things clear.

By the way... these issues are the province of the URI working group,
whose discussion forum is the uri mailing list. Contact uri-request@bunyip.com.

Dan