Re: Gopher+ Considered Harmful

Tim Berners-Lee (timbl@www3.cern.ch)
Fri, 11 Dec 92 15:18:01 +0100


> And I still don't. I have the feeling that it would be much easier to
> adapt HTTP to other (non-TCP) transport protocols if the size of an
> entity is given separately rather than computed from the entity itself
> (after all this nonsense is only necessary because TCP doesn't have a
> way to distinguish EOF from a broken connection). As I understand it
> your main objection is that under my proposal you will have to
> construct the necessary headers in a buffer first. I don't believe
> that this is that much of a hassle in today's computers -- it
> shouldn't be more than a couple of kilobytes even in extreme cases,
> which is peanuts even for a standard PC.

It is not the space to buffer the stuff in the average case which is a problem.

There are extreme cases: long documents which spew out of format converters
piped into other format converters. These things would blow the memory of a
server, which we never like to do.

There is the cumulative effect of response times. Currently, almost all the W3
code is pipelined, so the response (click mouse to first character on screen) is
a function of the round trip delays and any real retrieval time. The moment you
put a buffer in to count bytes, you have to wait for the first byte until the
last is available. In the (frequent) case of many stages being involved in a
pipeline the response time does not in fact increase much; you just get a lot
of CPU from processors on the pipeline. Once you buffer it up, you are using
CPU from one processor at a time. You can't start displaying it until you've
parsed it and you can't parse it until you've read it and you can't read it
until the server has counted it and he can't even start to count it until all
the real work has been finished.

You will notice the difference immediately.
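
To make the point concrete, here is a rough sketch in Python. It is not W3
code: the stage names (source, convert, count_then_send) and the "Length:"
line are made up for illustration. It counts how many chunks the retrieval
stage has to produce before the first output byte can leave, once with a
pure pipeline and once with a byte-counting buffer in the way.

```python
from typing import Iterable, Iterator, List


def source(n: int, log: List[int]) -> Iterator[bytes]:
    """Hypothetical retrieval stage: yields n chunks, logging each one produced."""
    for i in range(n):
        log.append(i)
        yield f"chunk {i}\n".encode("ascii")


def convert(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Hypothetical format converter: passes each chunk on as soon as it arrives."""
    for chunk in chunks:
        yield chunk.upper()


def count_then_send(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Byte-counting stage: must drain its whole input before emitting anything."""
    body = b"".join(chunks)
    yield f"Length: {len(body)}\n".encode("ascii")  # made-up header line
    yield body


def chunks_needed_for_first_output(pipeline: Iterator[bytes], log: List[int]) -> int:
    """Pull one item from the pipeline and report how much the source had to produce."""
    next(pipeline)
    return len(log)


def streamed_cost(n: int) -> int:
    log: List[int] = []
    return chunks_needed_for_first_output(convert(source(n, log)), log)


def buffered_cost(n: int) -> int:
    log: List[int] = []
    return chunks_needed_for_first_output(count_then_send(convert(source(n, log))), log)
```

With the pipeline, streamed_cost(n) stays at one chunk however long the
document is; with the counting stage in place, buffered_cost(n) grows to the
full n before anything at all can be sent.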

Piping things until EOF is so much faster. Can TCP really not tell the
difference between a remote connection close, and a broken connection? :-((
(APIs apart)

> An issue on which I don't have a strong opinion is whether we should
> represent line separators as CRLF in the header -- anyone else?
>

If you are going to be telnet-style, then CRLF it has to be.
My comment in the proposed spec

http://info.cern.ch/hypertext/WWW/Protocols/HTTP/HTTP2.html

was "...In particular, lines should be regarded as terminated by the Line Feed,
and the preceeding Carriage Return character ignored." under a note on
"tolerance".

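A minimal sketch of that tolerance rule in Python, assuming the input arrives
as a byte string. The function name split_lines_tolerant and the sample text
are made up for illustration and are not part of the spec. A line ends at the
Line Feed, and one Carriage Return immediately before it is dropped, so both
CRLF and bare LF senders are accepted.

```python
from typing import List, Tuple


def split_lines_tolerant(data: bytes) -> Tuple[List[bytes], bytes]:
    """Split data into complete lines plus any unterminated remainder.

    A line is terminated by LF; a single CR directly before the LF is ignored.
    """
    # bytes.split always returns at least one element, so the unpacking is safe.
    *complete, rest = data.split(b"\n")
    lines = [line[:-1] if line.endswith(b"\r") else line for line in complete]
    return lines, rest
```

The remainder is returned separately so that a caller reading from a
connection can prepend it to the next block of bytes rather than treating a
half-received line as complete.
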
Tim