Re: CGI/1.0 --- what's wrong with the status quo?

Robert S. Thau (rst@ai.mit.edu)
Tue, 28 Dec 93 17:32:55 EST


> Well, you ask what is wrong with the status quo and then tell us about
> the modifications you have made to your server...

Perhaps I should have made it clearer that I was referring to the status
quo definition of the *protocol*. The changes I have made to the server
don't affect *that* in the least --- it is already possible to mix scripts
and ordinary files indiscriminately from the client's perspective with the
stock NCSA httpd using ScriptAlias, with none of my hacks at all.

> ...in order to get around
> one of the problems which wouldn't exist if either of the suggestions
> made by Charles Henrich and myself were adopted.

Again, the changes I've made to the server have absolutely *nothing* to do
with the way scripts get their parameters (which is what your suggestion,
and Charles', would affect). They have to do with the way that the daemon
finds the scripts in the first place --- in particular, which directories
it will search.

Perhaps it was confusing to discuss these two separate issues in the same
note, but I was trying to use them to argue the same point, namely, that
from a user's perspective, it is better for the server software to become
*more* flexible rather than less. (N.B. I'm counting script authors as
users in this context --- the author of a script is using the server).

> You are absolutely
> right that there is no reason that the script and coversheet should
> have to be in different directories. There is also no reason that
> directories containing scripts have to be listed in configuration
> files and processed on server start up.

I'm glad we agree about these, but...

> Or that scripts need to be
> distinguished from ordinary files by a naming convention which the
> server presumably decodes.

I've got two comments on this:

First off --- CGI/1.0 already has a naming convention which some people
find at least irksome, the 'nph-' business. Secondly --- if scripts and
ordinary files coexist in the same directories, and the server can't tell
them apart by the names, then how *can* it tell them apart? How is the
server to know whether to read the file or to run it?

I suppose one could do something with file permissions, but I honestly
prefer suffixing the name with '.doit'. The trouble with using permission
bits is that stray 'x' bits do occasionally get set on ordinary files.
With my server the way it is, this doesn't matter. On the other hand, if
the server were using the x bits to tell whether to run the file, and a
stray 'x' bit landed on some gateway's coversheet, the server would
wind up trying to exec() a file full of HTML, fail, and return a '500
Server Error' which *really* confuses the hell out of some poor novice.
("The file is there. Why can't the server read it?").

In short, the naming convention makes it obvious, simply by looking at a
file, whether it is a script which the server should run, or an ordinary
file which the server should just throw over the transom. From a *user's*
perspective, that's simplicity --- even if it takes ten more lines of code
in the server. (This is not an exaggeration, BTW --- see below).
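
To make the stray 'x' bit case concrete, here is a minimal sketch in
Python (not the server's actual C code) of the two ways a server might
decide whether to run a file. The function names and the demo file are
hypothetical; only the '.doit' suffix comes from the scheme described
here.

```python
import os
import stat
import tempfile

SCRIPT_SUFFIX = ".doit"  # the suffix named in this message


def is_script_by_name(path):
    """Naming convention: a file is a script only if its name says so."""
    return path.endswith(SCRIPT_SUFFIX)


def is_script_by_mode(path):
    """Permission convention: a file is a script if any execute bit is set."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))


def demo():
    """Classify an HTML coversheet that has picked up a stray 'x' bit."""
    with tempfile.TemporaryDirectory() as d:
        cover = os.path.join(d, "coversheet.html")
        with open(cover, "w") as f:
            f.write("<title>Gateway</title>\n")
        os.chmod(cover, 0o755)  # the stray execute bit
        return is_script_by_name(cover), is_script_by_mode(cover)
```

On a Unix filesystem, the name test says "ordinary file" while the mode
test says "script", which is exactly the case where a permission-based
server would try to exec() a page of HTML.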

> Adding unnecessary complexity to the
> server is undesirable. You have now added more code to your server
> and you still don't have the functionality (much less the simplicity)
> that a very minor change in the protocol would give.

I'm not sure what you're getting at. What functionality don't I have?
Please be specific --- show me something I can't do. As to simplicity,
that's a matter of perspective. As the author of several scripts, I regard
your proposed changes as *adding* complexity, by giving me one more
inessential detail to keep track of. Granted, the server code does become
perhaps a little simpler, but see below for more on how I see the
tradeoff...

In any case, the amount of code I have added to the server is *minimal* ---
the total number of lines changed or added is well under 200. If I deleted
all of the code related to ScriptAlias (which I no longer actually use), I
think the server would actually shrink substantially.

> All I am saying is SIMPLE IS GOOD. Unnecessary complexity
> is bad.

I suppose most people would agree with this in the abstract --- until you
get around to the tricky issues of what exactly is "complexity", and what
is "necessary", from whose perspective. In particular, as I've said, you
are proposing to *add* complexity from the perspective of the script writer
--- in terms of requiring a fixed form for the parameters of their scripts,
which is one more inessential detail to keep track of and get right --- in
order to keep *your* code simple and clean:

> We are talking about making
> the *servers* clean and simple and about making things clearer for human
> maintainers.

The simplification is in whatever routine in the server identifies the
PATH_INFO parameters to a CGI script. In the distributed NCSA server, this
routine is 22 lines of code (get_path_info in http_script.c), two of which
are blank. In my version, it's 62 lines, but I can shrink it to 29 by
reverting to the original code's K&R brace style, and stripping out blank
lines and comments. (BTW, I'm counting these 33 lines of braces and
whitespace in the change count above. Also, the nine extra lines of
executable code here are the ones that add the '.doit' and '.nph' suffixes
before checking for the existence of the script --- the naming convention
mentioned above. We are not talking about an enormous amount of code to
implement *any* of this stuff.)
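
For readers who want to see the shape of such a routine, the following
is a rough sketch in Python, not the code from http_script.c. It
assumes the server tries each successively longer prefix of the URL
path, appends each suffix, and treats whatever follows the first match
as PATH_INFO; that search order, the function name, and the example
paths are all assumptions made for illustration.

```python
import os
import tempfile

SUFFIXES = (".doit", ".nph")  # the suffixes named in this message


def split_script_path(docroot, url_path):
    """Return (script_file, path_info), or None if no prefix names a script.

    A real server would also have to reject '..' components; this
    sketch leaves that out.
    """
    parts = [p for p in url_path.split("/") if p]
    for i in range(1, len(parts) + 1):
        base = os.path.join(docroot, *parts[:i])
        for suffix in SUFFIXES:
            candidate = base + suffix
            if os.path.isfile(candidate):
                rest = parts[i:]
                path_info = "/" + "/".join(rest) if rest else ""
                return candidate, path_info
    return None


def demo():
    """Build a tiny document tree and resolve two URLs against it."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "maps"))
        open(os.path.join(root, "maps", "imagemap.doit"), "w").close()
        found = split_script_path(root, "/maps/imagemap/campus/north")
        missing = split_script_path(root, "/maps/plain.html")
        return os.path.basename(found[0]), found[1], missing
```

In the demo, the URL /maps/imagemap/campus/north resolves to the file
imagemap.doit with PATH_INFO "/campus/north", while /maps/plain.html
matches no script and would be served as an ordinary file.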

The complication is in every CGI script that takes PATH_INFO. At my site,
that includes 'imagemap' (which may well be the single most used CGI script
anyplace), my info gateway, and several scripts which form a community
hotlist system which I'm playing around with, along with a few more minor
experiments.
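
As a reminder of what the script side looks like today, here is a
minimal sketch in Python of how a script might pick up its extra path.
CGI passes it in the PATH_INFO environment variable; the helper name
and the optional `environ` argument are hypothetical, added only so the
function can be exercised outside a server.

```python
import os


def path_info_components(environ=None):
    """Split the CGI PATH_INFO variable into its path components."""
    if environ is None:
        environ = os.environ
    return [p for p in environ.get("PATH_INFO", "").split("/") if p]
```

Every script written this way depends on the current form of the
parameter, which is why a change to that form touches each of them.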

In short, I'm not at all sure I can see the tradeoff the same way you do
once the scripts, and the documents which already have links to them, are
put into the balance.

> I would be interested in hearing from server writers, like Rob McCool, Tony
> Sanders and the CERN server author. Also the views of script writers
> would be valuable.

I've never written a whole server, but I have written several nontrivial
scripts. You've got my opinion...

rst