Re: partial URLs ? (was <p> ... </p>)

William C. Cheng (william@cs.columbia.edu)
Wed, 20 Dec 1995 19:33:15 -0500


| > I like Dan Connolly's response that a well-behaved Client should NOT
| > request any URL with ../ in it because it may get a 403 response.
|
| I don't like that argument (and I didn't see it from Dan) - it's very
| Unix-centric, and doesn't generalize. After all, if you can't use some
| string in a URL because it MAY get a 403 response, then I can add a
| single line to my server config that would imply you shouldn't use any
| text string in a URL.

The point is that you CAN use the string in a URL; the browser should
do some processing before sending it to a server (just as a browser
should replace "&amp;" by "&" before sending it to the server, as
discussed in another message).

| What behavior did Dan (or you) recommend if I type in a URL with a
| "../" in it by hand? Not doing what the user asked you to to avoid
| vague security problems on someone else's machine is pretty clearly
| broken. Escaping the URL is acceptable, and might even produce the
| correct results.

Dan's message to www-html is included at the end. I think he is
suggesting that a browser should process the "../" by collapsing it
against the preceding path segment. At the end of his message, it seems
to me that he is also suggesting that maybe this should go into the
HTTP spec.

--
Bill Cheng // Guest at Columbia University Computer Science Department
william@CS.COLUMBIA.EDU      ...!{uunet|ucbvax}!cs.columbia.edu!william
WWW Home Page: <URL:http://www.cs.columbia.edu/~william>

(Sorry if you have seen this already.)

--------------------------> included message <--------------------------
Resent-Message-Id: <199512201539.KAA27847@www19.w3.org>
Message-Id: <m0tSQNg-0002S3C@beach.w3.org>
To: Jon Wallis <j.wallis@wlv.ac.uk>
Cc: BearHeart/Bill Weinman <BearHeart@bearnet.com>, www-html@w3.org
Cc: http-wg@cuckoo.hpl.hp.com
Subject: Re: partial URLs ? (was <p> ... </p>)
In-Reply-To: Your message of "Wed, 20 Dec 1995 11:31:57 GMT." <m0tSMkY-000oANC@ccug.wlv.ac.uk>
Mime-Version: 1.0
Content-Id: <6337.819473076.1@beach.w3.org>
Date: Wed, 20 Dec 1995 10:24:36 -0500
From: "Daniel W. Connolly" <connolly@beach.w3.org>
Resent-From: www-html@w3.org
X-Mailing-List: <www-html@w3.org> archive/latest/2018
X-Loop: www-html@w3.org
Sender: www-html-request@w3.org
Resent-Sender: www-html-request@w3.org
Precedence: list
Content-Type: text/plain; charset="us-ascii"
Content-Length: 2397

In message <m0tSMkY-000oANC@ccug.wlv.ac.uk>, Jon Wallis writes:
>At 13:19 19/12/95 -0600, BearHeart/Bill Weinman wrote:
>>
>>At 10:40 am 12/19/95 -0800, Walter Ian Kaye wrote:
>>><A HREF="index.html"><IMG SRC="../gifs/btnhome3.gif" ALT="[Home]"
>border=1></A>
>>><A HREF="../map.html"><IMG SRC="../gifs/btnmap3.gif" ALT="[Index]"
>>
>>>(I'm gonna be changing the form and cgi soon, btw, cuz Lynx doesn't like
>>>partial URLs -- tho' Netscape handles this form perfectly.)
>>
>> The problem with the partial URLs may be the "../" references.
>>
>> Some servers, and perhaps some browsers too, disallow them because
>>they've been abused to get around security measures.
>
>That really shouldn't be a problem if the system is set up right - but since
>so many systems are poorly set up in terms of security I can believe it.

I think there are two issues that are getting confused here: (1) whether it's OK to use ../../ in an HREF or SRC attribute in an HTML document, (2) whether it's OK to _send_ ../../ in the path field of an HTTP request.

(1) is cool, (2) is not.

For example, if the example above was fetched from http://www.foo.com/a/b/c.html, then to fetch the [Home] image, the client must combine the value of the SRC attribute with the base URL as per RFC1808, yielding:

http://www.foo.com/a/gifs/btnhome3.gif
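A minimal sketch of that client-side resolution step, assuming Python's standard urllib.parse module (not something the original message used). Current Python follows RFC 3986, which superseded RFC 1808, but for this reference the result is the same:

```python
from urllib.parse import urljoin, urlsplit

# Base URL the document was fetched from, and the relative
# reference taken from the SRC attribute in the quoted HTML.
base = "http://www.foo.com/a/b/c.html"
ref = "../gifs/btnhome3.gif"

# Resolve the reference on the client; the ".." segment is
# collapsed here, before anything goes on the wire.
absolute = urljoin(base, ref)

# Only the already-collapsed path goes into the request line.
request_line = "GET " + urlsplit(absolute).path + " HTTP/1.0"

print(absolute)
print(request_line)
```

The point of the sketch is only that the ".." never leaves the client: it is consumed while building the absolute URL.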

To access the resource at that address, it makes a TCP connection to port 80 of www.foo.com, and sends:

GET /a/gifs/btnhome3.gif HTTP/1.0
Accept: image/*

What's _not_ cool is to try to sidestep the processing of .. on the client side; that is, to just combine the base and the SRC value into:

http://www.foo.com/a/b/../gifs/btnhome3.gif

(which is _not_ a well-formed HTTP url) and send:

GET /a/b/../gifs/btnhome3.gif HTTP/1.0

This is illegal because it is a potential security risk. Consider a server whose document root is /usr/local/etc/httpd/docs/ and a client who sends:

GET /../../../../etc/passwd HTTP/1.0
Accept: text/plain

a naive server implementation might just do: fopen("/usr/local/etc/httpd/docs//../../../../etc/passwd") and give away a bunch of sensitive info.
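To make the difference concrete, here is a rough sketch contrasting the naive mapping with one that normalizes the path and checks that it stays under the document root. The function names naive_path and safe_path are hypothetical, chosen for illustration, and the sketch assumes POSIX-style paths and ignores symbolic links:

```python
import posixpath

# Document root taken from the example above.
DOC_ROOT = "/usr/local/etc/httpd/docs"

def naive_path(url_path):
    # Hypothetical naive server: plain string concatenation,
    # so any ".." segments survive into the filesystem path.
    return DOC_ROOT + "/" + url_path

def safe_path(url_path):
    # Hypothetical safer mapping: join, collapse "." and ".."
    # segments, then refuse anything outside the document root.
    # Assumption: no symlinks point outside the root.
    joined = posixpath.join(DOC_ROOT, url_path.lstrip("/"))
    candidate = posixpath.normpath(joined)
    if candidate != DOC_ROOT and not candidate.startswith(DOC_ROOT + "/"):
        return None
    return candidate

print(naive_path("/../../../../etc/passwd"))
print(safe_path("/../../../../etc/passwd"))
print(safe_path("/a/gifs/btnhome3.gif"))
```

The first call reproduces the string passed to fopen above; the second returns None because the normalized path falls outside the document root.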

Instead, any server that sees /../ in the HTTP path is supposed to issue a 403 Forbidden response. (Is this in the HTTP specs somewhere? YIKES! I can't find it in draft-ietf-http-v10-spec-02.txt!!!)
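The rejection rule described here could be sketched roughly as follows. The function name check_request_path is hypothetical, and the sketch assumes the path has already been percent-decoded:

```python
from http import HTTPStatus

def check_request_path(path):
    # Hypothetical check: refuse any request whose path contains
    # a ".." segment. Assumes percent-decoding already happened,
    # so an encoded "%2e%2e" would not be caught by this alone.
    if ".." in path.split("/"):
        return HTTPStatus.FORBIDDEN
    return None  # no objection; carry on handling the request

print(check_request_path("/a/b/../gifs/btnhome3.gif"))
print(check_request_path("/a/gifs/btnhome3.gif"))
```

Matching on whole segments rather than on the substring ".." keeps a legitimate name such as "notes..txt" from being refused.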

HTTP-WG folks: this should be addressed in the HTTP 1.0 spec, no?

Dan