Re: Multiple objects in a single transaction: Making it more concrete

Marc VanHeyningen (mvanheyn@cs.indiana.edu)
Mon, 11 Apr 1994 14:59:51 -0500


Here are what I think are the most salient issues involved with
encapsulating multiple transactions into a single transaction, per my
earlier discussions of creating types message/http-request and
message/http-response to enclose these into multipart constructs.

I'd appreciate lively feedback and discussion on these, particularly
from MIME-gurus who know more about What is Right than I do (too bad
there's no easy way to cross-post this to the MIME mailing list.) I'd
like to get together a registration for the new content-types if this
looks like the right way to go; I think having content-types for HTTP
requests and responses is overdue in any case and could be good for
things other than bundling them together for inlined images.

METHOD
------

When the client sends a multipart message containing many parts of
type message/http-request, what method should be used? GET hardly
seems appropriate. In a sense it is POSTing a message to be processed
by the server, so POST might be suitable. Some have suggested MGET;
this might be OK but suggests that all the enclosed requests must be
GET requests, which won't necessarily be the case though it's likely
to be typical. I used MULTICOMMAND in the example just off the top of
my head. A new method also has the advantage that a server can
clearly indicate its support by including that method in the Public:
header.
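As a rough illustration of that last point, here is a minimal Python
sketch of how a client might check for support before bundling. It
assumes the Public: header carries a comma-separated list of method
names and that names compare exactly; supports_method is a hypothetical
helper, and MULTICOMMAND is just the placeholder name used above.

```python
def supports_method(public_header: str, method: str) -> bool:
    """Return True if `method` is listed in a Public: header value.

    Assumption: the value is a comma-separated list of method names,
    compared exactly (no case folding).
    """
    advertised = {token.strip() for token in public_header.split(",")}
    advertised.discard("")  # ignore empty entries such as a trailing comma
    return method in advertised


if __name__ == "__main__":
    print(supports_method("GET, HEAD, POST, MULTICOMMAND", "MULTICOMMAND"))
    print(supports_method("GET, HEAD, POST", "MULTICOMMAND"))
```

A client that gets False here would simply fall back to issuing the
requests one at a time.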

IMHO there's no need for this method to have a path associated with
it. (Tony suggested using the path of the document containing the
multiple links being requested, for accounting; I think the Referer:
header is where this information should go.) That suggests not using
any existing method, since all of them require paths. The first line
might therefore look like:

FOO HTTP/1.0

for whatever value of FOO we like.

HANDLING HTTP/0.9
-----------------

Before anyone shouts, keep in mind the fact that we're looking for a
general approach to this; for instance, a client may issue a bundled
request to a cache server, which in turn handles each of the requests,
some of which may involve working with old servers.

Requests of the old form, if there are any clients that still
generate them, are fairly easily handled by the existing http-request
format since the first line will contain the version information. It
might be desirable to force them to end with a blank line, as HTTP/1.0
requests do.

Responses are a little harder, since the client doesn't get to know
"how much" of the original part was read. One approach would be to
force any HTTP/0.9 responses to begin with a blank line, to separate
the headers (of which there are none) from the body.
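To make the blank-line idea concrete, here is a small Python sketch,
offered only as one possible reading of it. The function names are
hypothetical; the sketch assumes an encapsulated HTTP/0.9 response is
stored as an empty line followed by the raw body, while an HTTP/1.0
response keeps its status line and headers ahead of the first blank
line.

```python
from typing import Tuple


def wrap_http09_response(body: bytes) -> bytes:
    """Prefix a header-less HTTP/0.9 body with an empty line."""
    return b"\r\n" + body


def split_encapsulated_response(part: bytes) -> Tuple[bytes, bytes]:
    """Split an encapsulated response into (head, body).

    `head` is empty for an HTTP/0.9-style part, i.e. one that starts
    with a blank line.
    """
    for blank in (b"\r\n", b"\n"):
        if part.startswith(blank):
            return b"", part[len(blank):]
    # Otherwise cut at the earliest blank line, CRLF or bare LF.
    hits = [(part.find(sep), sep) for sep in (b"\r\n\r\n", b"\n\n")]
    hits = [(index, sep) for index, sep in hits if index != -1]
    if not hits:
        return part, b""  # head only, no body
    index, sep = min(hits)
    return part[:index], part[index + len(sep):]


if __name__ == "__main__":
    old = wrap_http09_response(b"<html>hi</html>")
    print(split_encapsulated_response(old))
```

With this convention the reader of the part never has to guess whether
the first bytes are headers or data.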

Should there be a version parameter set to either 0.9 or 1.0 (or
HTTP/0.9 or HTTP/1.0) appropriately, or is the implicit information
inside sufficient? My inclination is to include one, but I'm not
sure.

BINDING
-------

How can you tell which of the many message/http-response objects goes
with which of the message/http-request objects?

One approach is to simply use ordering; the first response goes with
the first request, and so on. This implicit association makes me feel
a bit uneasy. It would be nice to have a more flexible model. For
instance, I might send a bunch of requests to a cache server, which
then handles them in parallel and sends the results all back to me; it
might make it easier on it not to have to worry about ordering.

Another approach would be to associate a Content-ID with each of the
requests, and have each of the responses bear some indication of what
the Content-ID of the request being serviced was. The most
appropriate existing header for this seems to be In-Reply-To. Thus,
it might look sort of like this:

Content-Type: multipart/mixed; boundary=foo

--foo
Content-Type: message/http-request
Content-ID: <number1>

GET /foo HTTP/1.0
[ etc ]

--foo
Content-Type: message/http-request
Content-ID: <number2>

GET /bar HTTP/1.0
[ some headers ]

--foo--

with the multipart/mixed object returned structured like this:

Content-Type: multipart/mixed; boundary="bar"

--bar
Content-Type: message/http-response
In-Reply-To: <number1>

[ /foo data ]
--bar
Content-Type: message/http-response
In-Reply-To: <number2>

[ /bar data ]
--bar--

The important part is that, IMHO, the binding between request and
response should go on as part of the MIME headers making up multiple
parts, and NOT as part of the HTTP requests and responses themselves,
which should pass through the encapsulation totally unchanged.

My inclination is that both will have to be tolerated; i.e. clients
should use a header like In-Reply-To if it's available, and default to
implicit binding from the ordering if it's not.
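A minimal Python sketch of that fallback rule follows. It is only an
illustration under stated assumptions: bind_responses is a hypothetical
helper, the multipart bodies are taken to be already parsed into
(In-Reply-To value or None, payload) pairs, and explicit binding is
used only when every response part carries In-Reply-To; anything less
falls back to ordering.

```python
from typing import Dict, List, Optional, Tuple


def bind_responses(
    request_ids: List[str],
    responses: List[Tuple[Optional[str], bytes]],
) -> Dict[str, bytes]:
    """Map each request Content-ID to its response payload.

    `responses` holds (in_reply_to, payload) pairs in arrival order;
    in_reply_to is None when the part had no In-Reply-To header.
    """
    if responses and all(reply_to is not None for reply_to, _ in responses):
        # Explicit binding: ordering no longer matters.
        bound: Dict[str, bytes] = {}
        for reply_to, payload in responses:
            if reply_to not in request_ids:
                raise ValueError("response to unknown request: %r" % reply_to)
            bound[reply_to] = payload
        return bound
    # Implicit binding: first response goes with first request, and so on.
    return {rid: payload for rid, (_, payload) in zip(request_ids, responses)}


if __name__ == "__main__":
    ids = ["<number1>", "<number2>"]
    out_of_order = [("<number2>", b"/bar data"), ("<number1>", b"/foo data")]
    print(bind_responses(ids, out_of_order))
```

Note that the HTTP requests and responses themselves are never
inspected here; only the MIME-level headers drive the binding.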

HANDLING THE FIRST LINE
-----------------------

HTTP requests and responses are syntactically MIME messages, except
for the first line of both the request and the response. If there
were some MIME-ish way to separate this first line from the rest of
the data, the rest of it would be syntactically understandable by any
existing system that can handle message/rfc822 (which, as the primary
subtype of message, is what any MUA should treat unknown subtypes of
message as.) This would mean HTTP requests and responses would "do
the right thing" if passed to any MIME mail reader, while the first
line is not a syntactically legal header and might cause some readers
to barf.

There are a couple of ways to handle this, but neither of them is
at all clean:
- Make multipart/http-{request,response}, which contains two body
parts, the first line and the rest of the message.
- Put the first line into parameters, e.g.
Content-Type: http/request; method=GET; path="/foo/bar.gif"; version="HTTP/1.0"

My inclination is that the advantage from going to this ugliness is
not sufficiently great, but I'm open to being persuaded otherwise.
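For what it's worth, the separation itself is easy to sketch. The
Python below is a hedged illustration, not a proposal: split_first_line
and as_content_type are hypothetical names, a two-field first line is
assumed to mean HTTP/0.9, and the standard email.parser module stands
in for "any existing system that can handle message/rfc822".

```python
from email.message import Message
from email.parser import Parser
from typing import Dict, Tuple


def split_first_line(request_text: str) -> Tuple[Dict[str, str], Message]:
    """Peel the request line off and parse the rest as RFC 822 headers."""
    first_line, _, rest = request_text.partition("\n")
    fields = first_line.split()
    if len(fields) == 3:
        method, path, version = fields
    elif len(fields) == 2:
        method, path = fields
        version = "HTTP/0.9"  # assumption: no version field means 0.9
    else:
        raise ValueError("unrecognized request line: %r" % first_line)
    headers = Parser().parsestr(rest, headersonly=True)
    return {"method": method, "path": path, "version": version}, headers


def as_content_type(first: Dict[str, str]) -> str:
    """Render the request line as Content-Type parameters (second option).

    path and version are quoted because "/" is special in MIME
    parameter values.
    """
    return ('Content-Type: http/request; method={method}; '
            'path="{path}"; version="{version}"').format(**first)


if __name__ == "__main__":
    text = "GET /foo/bar.gif HTTP/1.0\nAccept: image/gif\n\n"
    first, headers = split_first_line(text)
    print(as_content_type(first))
    print(headers["Accept"])
```

Once the first line is out of the way, an ordinary mail header parser
is happy with everything that remains.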

CACHE SERVERS
-------------

I'm not entirely clear on exactly how cache servers work, but am
inclined to think the above approach could also allow a client to send
a composite request to a cache server, which would then service each
of the requests (which may or may not be to the same HTTP server, or
could be to another higher-level cache server in a hierarchy, or
whatever.) This also would allow other cache optimizations (e.g. a
cache server could send large numbers of HEAD or conditional GET
requests all at once to verify the currency of its cache more cheaply
than doing each one at a time.) Mirrors also might work more
efficiently (a mirror could GET everything, or a site being mirrored
could even PUT everything that has changed when updates occur.) Are
there any special considerations for cache servers and other
innovative HTTP variants not adequately addressed?
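As a sketch of the cache-validation case, the Python below builds one
composite body out of several conditional GETs. Everything here is
illustrative: build_bundle and conditional_get are hypothetical
helpers, the Content-ID values and boundary follow the earlier
example, and If-Modified-Since is assumed to be the header a cache
would use for the conditional form.

```python
from typing import List


def conditional_get(path: str, since: str) -> str:
    """Return a conditional GET request (no trailing line break)."""
    return "GET %s HTTP/1.0\r\nIf-Modified-Since: %s" % (path, since)


def build_bundle(requests: List[str], boundary: str = "foo") -> str:
    """Wrap each request in a message/http-request part of multipart/mixed."""
    lines = ["Content-Type: multipart/mixed; boundary=%s" % boundary, ""]
    for number, request in enumerate(requests, start=1):
        lines += [
            "--%s" % boundary,
            "Content-Type: message/http-request",
            "Content-ID: <number%d>" % number,
            "",        # end of the part's MIME headers
            request,   # the HTTP request, passed through unchanged
            "",        # blank line ending the HTTP request headers
        ]
    lines.append("--%s--" % boundary)
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    stamp = "Mon, 11 Apr 1994 19:59:51 GMT"  # made-up cache timestamp
    print(build_bundle([conditional_get("/foo", stamp),
                        conditional_get("/bar", stamp)]))
```

One round trip then revalidates the whole set, instead of one
connection per cached object.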

(BTW, as long as we're talking about cache servers, how do cache
servers handle negotiation of Content-Type?)

I'd appreciate feedback on this,
- Marc

--
Marc VanHeyningen  mvanheyn@cs.indiana.edu  MIME, RIPEM & HTTP spoken here