Re: Performance analysis questions

Andrew Payne (payne@n8kei.tiac.net)
Sun, 15 May 1994 09:56:32 -0400


Daniel Connolly writes:

>HTTP is not Internet Mail. HTTP is a protocol based on a reliable byte
>stream, such as TCP. A reliable byte stream does not munge
>whitespace. It doesn't lose characters because it translated to EBCDIC
>and back.
>HTTP is not for the human eye: it's for a piece of software that groks
>TCP (or perhaps some other reliable transport eventually...).

I agree 100%, and this raises an important point for implementors: "be
conservative in what you send, but liberal in what you accept."

Specifically, implementations should NOT assume that the far end
strictly conforms to the standard. Making that assumption can result in
nasty surprises and *security holes*. This is a good place to invest
in defensive programming.

As an example, here's NCSA's HTTPD code for parsing '%' escaped sequences
in URLs and arguments (sorry to pick on you again, Rob):

    for(x=0,y=0;url[y];++x,++y) {
        if((url[x] = url[y]) == '%') {
            url[x] = x2c(&url[y+1]);
            y+=2;
        }
    }

This code assumes that the '%' is always followed by two characters. If it
isn't (i.e. the '%' is one of the last two characters in the string), the
"y+=2" carries the index past the null at the end of the string, and the
loop keeps copying whatever memory follows. The result is a garbage error
message, a core dump, or worse, the beginnings of a stack-overwriting
security hole. Try typing "GET %" at your server implementation.
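
For comparison, here is a minimal sketch of a more defensive decoder. This is
not NCSA's code: the names unescape_url_checked() and hexval() are made up for
illustration, and I am assuming the caller wants a malformed escape rejected
rather than passed through literally. The point is only that each character
after the '%' is checked before the index moves forward.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: value of one hex digit.
 * The caller has already checked the character with isxdigit(). */
static int hexval(unsigned char c)
{
    if (isdigit(c))
        return c - '0';
    return tolower(c) - 'a' + 10;
}

/* Hypothetical replacement for the loop above: decode %XX escapes in place.
 * Returns 0 on success, -1 if a '%' is not followed by two hex digits.
 * On failure the buffer may be partially decoded and should be discarded. */
int unescape_url_checked(char *url)
{
    size_t x = 0, y = 0;

    while (url[y] != '\0') {
        if (url[y] == '%') {
            unsigned char h1 = (unsigned char)url[y + 1];
            unsigned char h2;

            /* If h1 is the terminating null, isxdigit() fails here and
             * url[y + 2] is never read. */
            if (!isxdigit(h1))
                return -1;
            h2 = (unsigned char)url[y + 2];
            if (!isxdigit(h2))
                return -1;

            url[x++] = (char)(hexval(h1) * 16 + hexval(h2));
            y += 3;   /* skip '%' and both hex digits */
        } else {
            url[x++] = url[y++];
        }
    }
    url[x] = '\0';
    return 0;
}
```

With this version, "%" and "abc%4" both return -1 instead of running off the
end of the buffer. A real server would still have to decide what to do with
a decoded "%00", which this sketch does not address.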

>It is not the case that there are 1000s of broken HTTP implementations
>out there that we need to support. There are perhaps 10 or 20, with 2
>or 3 representing 99% of the traffic.
>Let us keep the HTTP protocol clear and free of such kludgery.

One of the best ways to enforce standard implementations is to have a good
test suite that checks for conformance.

-andy