Re: Holding connections open: an immodest proposal

Dave Kristol (dmk@allegra.att.com)
Wed, 14 Sep 94 10:50:03 EDT


hallam@dxal18.cern.ch (HALLAM-BAKER Phillip) says:

[details of another proposal to address...]
> 1) Loading all data segments associated with an object (eg html + inline images)
>
> 2) Continuous mode connection for realtime response.
>
>
> 1 is solved best through use of MIME multipart type. The browser does a request
> and gets back the complete object as a single document, inline images and all.
> This is currently being added to the library but slowly :-(
Yes, exactly. I contend this solution will take longer to deploy than
my proposal for the server to hold the connection open (for a while).
I think it's more complex, and, when the MIME solution is deployed,
there will be compatibility problems until everyone upgrades.
>
> There are two ways of doing this :
>
> 1) The server sends back everything as a unit
> 2) The client requests the inline images separately.
>
> The Server is actually in the best position to know whether an image
> is specific to one html or shared by many. Thus let the user decide whether
True, but irrelevant, for two reasons:
1) Even if an image is multiply shared, I may only ask for one of
the documents in which it's used.
2) I may already have the image cached in my client.
So only the client actually knows if it NEEDS the image.
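
To make the point concrete, here is a minimal sketch of the client-side
decision, assuming a hypothetical in-memory cache keyed by image URL. The
names ImageCollector and images_to_fetch are invented for illustration and
are not part of any browser or library.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def images_to_fetch(html_text, cache):
    """Return inline image URLs, in document order, that are not cached."""
    collector = ImageCollector()
    collector.feed(html_text)
    needed = []
    for src in collector.sources:
        # Skip images the client already holds, and skip repeats.
        if src not in cache and src not in needed:
            needed.append(src)
    return needed


# Hypothetical page and cache contents.
page = '<html><body><img src="logo.gif"><img src="photo.gif"></body></html>'
cached = {"logo.gif"}
print(images_to_fetch(page, cached))  # ['photo.gif']
```

The server cannot run this check, because it has no view of the client's
cache.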
> to run the mime packer on a file or not. If the images are zipped up all
> in a single fred.mime then they will always be sent together. This can also
> be done on the fly if a .mime is requested of a file only stored as .html,
> this is a server special though.
>
> The second method requires a slight change to the specs. Where we have at the
> moment
[discussion of a multiple-GET request, with the result being returned as
MIME multi-part]
Again, incompatible, with deployment problems, I believe.
>
> A second method of doing MGET is to permit wildcarding in a URL. For example
> it would be nice to be able to specify a hierarchy of directories as is
> possible under VMS.
How often do I really want to get a whole directory hierarchy? What
happens if intermediate points in the hierarchy are mapped by the
server to different files that are not in the actual file hierarchy?
>
> [hallam...]
> /hallam///
>
[more about this /// proposal deleted]
[discussion of copy command deleted]

> This implementation is a minimal one requiring no substantial changes to the
> architecture of the likes of Mosaic. To go to continuous connection is rather
> more radical since the browser should be capable of receiving async messages.

I shrink from the "continuous connection" label. In any case, how does
that prevent a browser from taking async messages? (Where are they
coming from?) The browser isn't typically listening for a connection --
it initiates them. If you're talking about things like window system
events, keeping a connection open has no effect on fielding such
asynchronous events, does it?

>
>
> 2 is really a second protocol even though it may be a superset of http. Ie we
> expect to use all the same specs except that content length is mandatory for
> every block sent. This allows for conferencing and MUD connections and is in
> practice a replacement for telnet.
I assume this "2" is what you call "continuous mode connection", not "request
all images separately".
>
> Even here it is not strictly necessary to allow multiple gets. A POST method
> with duplex transmission of MIME multipart messages would suffice. I suspect
> that a different method (DUPLEX) is justified though.
>
>
> The original idea of HTTP was that you did a single send and a single receive
> to obtain the object you want. NNTP and FTP negotiation is pretty futile and
> the continuous connection stuff is a real pain. We certainly do not need to do
> an FTP style second connection simply to provide MGET.

I endorse a single send and a single receive. I want to keep the
connection open "just a little bit longer" in case it proves to be
useful. It's a bit like those pay phones where you give your credit
card at the beginning, and then when you finish your first call, you
can hit a button to make a second, without having to give your card
number again.
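
A rough sketch of that behavior, using a toy one-line-per-request protocol
rather than real HTTP: the server answers each request, then waits a short
idle period for another one before hanging up. The function names, the
"OK" reply, and the 0.5 second idle limit are all assumptions made for the
demonstration.

```python
import socket
import threading


def serve_one_client(listener, idle_seconds):
    """Answer requests on one connection until it goes idle.

    Returns the number of requests served before the server hung up.
    """
    conn, _ = listener.accept()
    served = 0
    with conn:
        conn.settimeout(idle_seconds)
        while True:
            try:
                data = conn.recv(1024)
            except socket.timeout:
                break  # nothing arrived in time: stop holding the line
            if not data:
                break  # client closed first
            conn.sendall(b"OK " + data)
            served += 1
    return served


def demo(idle_seconds=0.5):
    listener = socket.create_server(("127.0.0.1", 0))
    result = {}
    worker = threading.Thread(
        target=lambda: result.update(
            served=serve_one_client(listener, idle_seconds)))
    worker.start()
    client = socket.create_connection(listener.getsockname())
    replies = []
    # Two requests over the same connection, no second setup.
    for request in (b"GET /page.html\n", b"GET /image.gif\n"):
        client.sendall(request)
        replies.append(client.recv(1024))
    worker.join()  # client stays silent, so the server times out
    closed_by_server = client.recv(1024) == b""
    client.close()
    listener.close()
    return result["served"], replies, closed_by_server


print(demo())
```

If no second request arrives, the only cost is the idle period itself.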

>
> I sketched out a suggestion for extending the HTTP protocol to multiple
> transaction for the first conference. To summarise :-
[summary elided]
[extols virtue of single-shot connection]
> But we do not want to go back to the system whereby to get a news article you
> have to send four commands and get back four responses and keep open a channel,
> blocking other users from using the system and using resources needlessly. FTPs
> throw you out whenever possible is pretty stupid.

Please reexamine my proposal and some facts.
1) After a TCP/IP connection closes, the host is obliged to tie up the
resources for about 2 minutes, in case some tardy packets arrive.
2) I proposed to keep a connection open on the order of tens of seconds
in case a new GET (or other method, for that matter) arrives.
3) Therefore my proposal holds resources out of use for no more than,
say, 25% longer than they would be anyway.
4) In the event that there was in fact an image URL in what the server
returned, and the client needs it, the already-open connection saves
connection setup time, slow-start time, and having a second (or more)
connection around lingering in TCP's TIME-WAIT state when it
eventually gets closed.
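
The arithmetic behind points 1 through 4 can be sketched as follows. The
30 second hold time is an assumed value for "tens of seconds", the 120
seconds comes from "about 2 minutes", and the four-object page is
hypothetical. The model ignores transfer time.

```python
def extra_hold_fraction(hold_open_seconds, post_close_seconds=120.0):
    """Extra time resources stay tied up, relative to the post-close wait."""
    return hold_open_seconds / post_close_seconds


def lingering_connections(objects_fetched, sustained):
    """Connections left in the post-close wait after fetching all objects."""
    # One connection per object today; one in total if it is held open.
    return 1 if sustained else objects_fetched


print(extra_hold_fraction(30.0))                   # 0.25
print(lingering_connections(4, sustained=False))   # 4
print(lingering_connections(4, sustained=True))    # 1
```

So a page with three inline images leaves four connections in the
post-close wait under the current scheme, against one under the proposal.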

[disparagement of pragmas deleted]
> Summary:
> Yes we want connectionless and continuous connection HTTP. Possibly the
> latter has a different name. But the spec is pretty much the same.

Let me offer some more thoughts about why "sustained connection" (I
think "continuous connection" is too strong and misleading) may be a
good thing beyond the familiar discussions. Stuff is coming along that
will require negotiation between client and server: security, and
payment for information.

Consider, even now, the WWW Basic security scheme. The server rejects
a client's request and demands authentication. The connection closes.
After the client gets name and password, it starts a new connection and
repeats the request. However, if the client has cached the name and
password, it can respond immediately (no delay to query the user).
With a sustained connection, the same transactions would take place,
but over the originally created connection.
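
As an illustration only, the reject and re-request exchange over one
connection might look like the sketch below. It leans on the HTTP/1.1
persistent connections in Python's standard library, which postdate this
message and are not the proposal as specified here. The credentials, the
path, and the client_ports bookkeeping are hypothetical.

```python
import base64
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

USER, PASSWORD = "demo", "secret"  # hypothetical credentials
EXPECTED = "Basic " + base64.b64encode(
    f"{USER}:{PASSWORD}".encode()).decode()


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep the connection open between requests
    client_ports = []              # source port seen for each request

    def do_GET(self):
        Handler.client_ports.append(self.client_address[1])
        authorized = self.headers.get("Authorization") == EXPECTED
        body = b"protected document\n" if authorized else b"auth required\n"
        self.send_response(200 if authorized else 401)
        if not authorized:
            self.send_header("WWW-Authenticate", 'Basic realm="demo"')
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demonstration quiet


def fetch_with_retry():
    Handler.client_ports.clear()
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = HTTPConnection("127.0.0.1", server.server_address[1])
    try:
        conn.request("GET", "/doc.html")  # first try: no credentials
        first = conn.getresponse()
        first.read()                      # drain so the socket can be reused
        # Cached credentials: answer the challenge at once, same connection.
        conn.request("GET", "/doc.html", headers={"Authorization": EXPECTED})
        second = conn.getresponse()
        body = second.read()
        return first.status, second.status, body, list(Handler.client_ports)
    finally:
        conn.close()
        server.shutdown()
        server.server_close()


print(fetch_with_retry())
```

Both requests arrive from the same client port, so the retry pays no
second connection setup.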

If the negotiation paradigm -- reject request, query user, re-request --
becomes common, the cost of opening and closing connections becomes
more painful.

You feel a sustained connection is a major change to HTTP. I agree it's
a violation of the purity of the model, but in practical terms I think
it's a small change with potential benefits.

Dave Kristol