URL escaping

George Phillips (phillips@cs.ubc.ca)
17 Nov 93 13:55 -0800


Ari says:
>??? I thought HTTP doc specifies how to escape illegal characters
>in URL?

Maybe this is something that should be clarified. From the point
of view of a browser, here's how I see it:

URLs are opaque, except:

you know how to parse the //host:port stuff
you understand "/" so you can do relative paths
you understand "#" for hopping into documents and "?" for searches
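
As a rough illustration of the three exceptions above, here is a minimal
Python sketch using the standard urllib.parse module. The function name
browser_view and the www.example.org URLs are made up for the example;
this is one way to picture the split, not how any particular browser
does it.

```python
from urllib.parse import urlsplit, urljoin

def browser_view(url):
    """Return only the pieces of a URL that a browser has to interpret."""
    parts = urlsplit(url)
    return {
        "host": parts.hostname,      # from the //host:port part
        "port": parts.port,
        "path": parts.path,          # left undecoded: opaque apart from "/"
        "search": parts.query,       # whatever follows "?"
        "fragment": parts.fragment,  # whatever follows "#"
    }

# Hypothetical URL, used only for illustration.
view = browser_view("http://www.example.org:8080/docs/a%20b.html?term#intro")

# Relative paths only need the "/" structure, not the meaning of each segment.
relative = urljoin("http://www.example.org:8080/docs/a%20b.html", "../index.html")
```

Note that the %20 in the path is carried along untouched; nothing here
needs to know what it stands for.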

For certain schemes, URLs become even less opaque to the browser.
For example, for "gopher:", you _must_ use % hex escaping because
the browser must decode the URL for use in gopher. This is probably
true for "file:", "news:" and other URLs as well -- the browser
must know about the escaping because it will do the decoding.
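
To make the gopher case concrete, here is a small Python sketch of the
decoding a browser would have to do before talking to a gopher server.
It assumes the usual gopher URL layout (a one-character item type
followed by the selector) and the default port 70. The function name
gopher_request and the host name are hypothetical.

```python
from urllib.parse import urlsplit, unquote

def gopher_request(url):
    """Turn a gopher: URL into (host, port, item type, decoded selector)."""
    parts = urlsplit(url)
    path = parts.path[1:]           # drop the leading "/"
    item_type = path[:1] or "1"     # assumption: empty path means a menu ("1")
    selector = unquote(path[1:])    # the browser itself undoes the % hex escapes
    return parts.hostname, parts.port or 70, item_type, selector

# Hypothetical URL, used only for illustration.
request = gopher_request("gopher://gopher.example.org/0/docs/read%20me.txt")
```

The selector that would go out on the wire is "/docs/read me.txt", with a
real space, which is why the browser has to understand the escaping here.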

For "http:", it's different. The browser doesn't do the decoding
(except for some /#? stuff) and depends on the HTTP server to
give it 7-bit ASCII encoded URLs. As long as it spits out
7-bit ASCII, the encoding is completely up to the server.
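
By contrast, the "http:" view described above could be sketched like
this in Python: the browser strips the "#" part, which it handles
itself, and hands everything else back to the server exactly as it
received it. The function name http_request_target and the URL are
invented for the example.

```python
from urllib.parse import urlsplit

def http_request_target(url):
    """Return what the browser passes to the HTTP server, escapes untouched."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query   # the search part goes to the server
    return target                     # the "#" part stays in the browser

# Hypothetical URL, used only for illustration.
target = http_request_target("http://www.example.org/cgi/x%2Fy%41?q=a%20b#top")
```

Nothing is decoded along the way: %2F, %41 and %20 mean whatever the
server that produced the URL intended them to mean.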

Is this the way things are, or am I way off here?

-- George