CGI suggestion

Marc Andreessen (marca@ncsa.uiuc.edu)
Tue, 28 Dec 93 08:31:47 -0600


John Franks writes:
> Now that I am seriously looking at implementing the CGI interface,
> I find one part problematic. This is the way that "state information"
> or arguments to a script get encoded in a URL as a sort of pseudo-path
> at the end.
>
> Here are my objections:
>
> 1. It is not possible to fully parse the URL without knowledge of the
> server's file hierarchy. For example, without knowing something about
> the file structure of the server I can't tell whether

Who is "I" in this context? If I == the server, then the server's
file hierarchy is in fact known. If I == some user, then it doesn't
matter one way or the other, does it (since the URL should be
considered opaque anyway)? I'm probably missing something...

Cheers,
Marc
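
For the server-side case, here is a minimal sketch of how a server that does know its own hierarchy might split the example URL path /foo1/foo2/foo3 into a script and its trailing path data. This is only an illustration, not how any particular server does it: the function name and the `known_scripts` set (standing in for a real filesystem lookup) are hypothetical.

```python
def split_script_and_path_info(url_path, known_scripts):
    """Split a URL path into (script, path_info).

    known_scripts is a hypothetical stand-in for the server's file
    hierarchy: a set of URL paths that name executable scripts.
    The shortest prefix that names a script wins.
    """
    segments = [s for s in url_path.split("/") if s]
    for i in range(1, len(segments) + 1):
        candidate = "/" + "/".join(segments[:i])
        if candidate in known_scripts:
            rest = segments[i:]
            path_info = "/" + "/".join(rest) if rest else ""
            return candidate, path_info
    # No prefix names a script, so there is nothing to split.
    return None, ""
```

With `known_scripts = {"/foo1/foo2"}` the path splits into script /foo1/foo2 and data /foo3; with `known_scripts = {"/foo1"}` the same path splits into /foo1 and /foo2/foo3. The answer depends entirely on what the server has on disk, which is the ambiguity the quoted message describes.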

>
> http://host.edu/foo1/foo2/foo3
>
> means script /foo1/foo2 with parameter foo3 or script /foo1 with
> parameter /foo2/foo3. I am not sure that there won't at some point be
> a need to get this information. Maybe not, but in any case this syntax is
> cumbersome to implement.
>
> 2. Assuming in the example above that the parameter is foo3 (or /foo3 ?)
> then the URL actually refers to two files: root/foo1/foo2 and, say,
> root/u/Web/foo3. Inexperienced users will find this confusing and
> expect to find an actual file root/foo1/foo2/foo3.
>
> 3. This syntax overloads the '/' token so it has very different meanings
> depending on context and does this in a situation where the context
> isn't readily visible. In my experience this is conducive to errors.
>
>
> SUGGESTION:
>
> I would like to make it a CGI *requirement* that the PATH_INFO data
> at the end of a URL contain an '=' and that this '=' be before the
> occurrence of any '/' in this data.
>
> Here is what the example above might be like:
>
> /foo1/foo2/path=foo3
>
> Other legal and useful URL's might end like
>
> /foo1/foo2/param1=value1&param2=value2
>
> /foo1/foo2/path=foo3/foo4&path2=foo5
>
> URL's like this existing one from the Xerox PARC map server would be
> perfectly legal.
>
> http://pubweb.parc.xerox.com/map/color=1/ht=30/lat=38.8/lon=-96
>
> But I would encourage map/color=1&ht=30 etc. instead of using '/' as
> the separator. The main reason is that code to parse the '&' version
> should be common since it is necessary for forms.
>
> If the server knows that an '=' will occur at the beginning of the
> PATH_INFO data (and that any ='s in the actual path are URL encoded),
> then this information can be used to parse the URL without knowledge
> of the server filesystem. Also it is quite clear that expressions like
> foo1/foo2/path=foo3 refer to two files, not one.
>
> The only significant change in the current CGI implementations that
> this would require is the PATH_TRANSLATED environment variable. I
> would suggest that this be replaced by a variable containing a
> directory name and then the script could create the translated path.
> For example if the URL ended in
>
> /foo1/foo2/file1=foo3&file2=foo4/foo5
>
> then the script could read the environment variable to get the directory,
> say, "/u/Web" and could reconstruct the file names /u/Web/foo3 and
> /u/Web/foo4/foo5. Notice that this allows more than one file name
> to be passed to the script, which is not currently possible.
>
> One final minor suggestion. If the PATH_INFO data actually starts
> with '=' as the first character, I would have the server strip this
> character before putting the information in the environment variable.
> This would be convenient for very simple scripts that shouldn't have
> to do any parsing. Thus a URL ending in
>
> /foo1/foo2/=foo3/foo4
>
> would have PATH_INFO set to "foo3/foo4". You could also keep the
> PATH_TRANSLATED environment variable for this kind of URL and then
> almost no changes would be necessary in current scripts.
>
> What do you think?
>
>
> John Franks Dept of Math. Northwestern University
> john@math.nwu.edu
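
The quoted proposal can be sketched in a few lines. This is a rough illustration under stated assumptions, not an existing CGI implementation: all three function names are hypothetical, and the sketch relies on the proposal's own rule that any '=' in the real path is URL encoded, so the first path segment containing a literal '=' marks the start of the PATH_INFO data.

```python
from urllib.parse import unquote

def split_at_first_equals(url_path):
    """Split a URL path into (script, data) without consulting the
    filesystem: the data starts at the first segment containing '='."""
    segments = url_path.split("/")
    for i, seg in enumerate(segments):
        if "=" in seg:
            return "/".join(segments[:i]), "/".join(segments[i:])
    return url_path, ""

def parse_path_data(data):
    """Parse the data part. A leading '=' means "no parsing wanted":
    the '=' is stripped and the rest is returned as a plain string.
    Otherwise '&'-separated name=value pairs are returned as a dict."""
    if data.startswith("="):
        return data[1:]
    params = {}
    for pair in data.split("&"):
        name, _, value = pair.partition("=")
        params[unquote(name)] = unquote(value)
    return params

def translate(directory, value):
    """Rebuild a file name from the directory the server would pass
    in an environment variable (for example "/u/Web")."""
    return directory.rstrip("/") + "/" + value.lstrip("/")
```

For the URL ending in /foo1/foo2/file1=foo3&file2=foo4/foo5, `split_at_first_equals` yields the script /foo1/foo2 and the data file1=foo3&file2=foo4/foo5; `parse_path_data` turns that into two values, foo3 and foo4/foo5, which `translate("/u/Web", ...)` maps to /u/Web/foo3 and /u/Web/foo4/foo5. For /foo1/foo2/=foo3/foo4 the data is returned unparsed as foo3/foo4.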