Real-time movies and libwww

Tim Berners-Lee (timbl@www3.cern.ch)
Wed, 15 Jun 94 10:30:45 +0200


David Berger (Yokozuna) <dvberger@snake.CS.Berkeley.EDU> wrote,

> I'm a graduate student at University of California at Berkeley in the
> department of Computer Science working in the multimedia group. As
> many of you are aware, we have developed and released the CMPlayer
> which plays MPEG movies (both audio and video) across a network.
>
> We are interested in interfacing the CMPlayer with NCSA Mosaic. In
> some of the preliminary work, we've studied HTTP and, of course,
> surveyed similar efforts across the Internet.
>
> We see a problem in either Mosaic or HTTP. In Mosaic, an object that
> is fetched by clicking on a hyperlink is done so synchronously, e.g.
> control does not return to the client until the entire transaction is
> completed.

I agree completely.

1. The NCSA's Mosaic product was based on CERN's libwww, but NCSA did not
pick up new versions of libwww after version 1.x. Since then both NCSA's branch
and CERN's have evolved. We have it high on our agenda to merge the
two branches again. Most of NCSA's enhancements and many others have
already been folded into libwww. So we'd like you to work with us
to incorporate real-time abilities into libwww for use by all clients,
not just NCSA's.

> A simple example is that you don't see your HTML page
> displayed until the entire page has been received.

2. The libwww internal structure uses a "stream" concept. A stream is
a write-only object. Data arriving over the net (etc.) is pushed into a
stream, which is typically a parser, reformatter, etc., which often
pushes data into another stream, until it is eventually stored,
presented, or passed across the net again. There is therefore no
buffering of the entire object between stages of the pipeline. Each
stage is a state machine which carries only the minimum state and
look-ahead necessary for parsing. You notice this pipelining with the
line mode browser: it displays the first characters on the screen the
moment they arrive, which gives a faster response time than you see
with Mosaic. It is only a feature of Mosaic that you don't see the
document until it is finished. (It's easier that way, and it is probably
faster to get to the end of the document if you don't have to keep the
screen refreshed.) The libwww pipeline architecture also makes it easy
to hang a movie player, for example, on the end of it.
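
As a rough illustration of the stream idea described above (a sketch
only, in Python rather than libwww's C, and not libwww's actual API; the
names Stream, LineSplitter and Collector are invented for this example):
each stage accepts pushed data, keeps only the state it needs, and
forwards results downstream at once.

```python
class Stream:
    """Write-only sink: data is pushed in with put(); close() marks the end."""

    def put(self, data: bytes) -> None:
        raise NotImplementedError

    def close(self) -> None:
        pass


class Collector(Stream):
    """Final stage, standing in for a display or player (hypothetical)."""

    def __init__(self):
        self.shown = []

    def put(self, data):
        self.shown.append(data)


class LineSplitter(Stream):
    """Middle stage: a state machine holding only the current partial line."""

    def __init__(self, target: Stream):
        self.target = target
        self.partial = b""

    def put(self, data):
        self.partial += data
        # Forward every complete line immediately; keep only the remainder.
        while b"\n" in self.partial:
            line, self.partial = self.partial.split(b"\n", 1)
            self.target.put(line)

    def close(self):
        if self.partial:
            self.target.put(self.partial)
            self.partial = b""
        self.target.close()


sink = Collector()
pipeline = LineSplitter(sink)
pipeline.put(b"<h1>Mov")            # no complete line yet, nothing shown
pipeline.put(b"ies</h1>\n<p>fi")    # first line is forwarded at once
first_seen = list(sink.shown)       # snapshot before the transfer ends
pipeline.put(b"rst frame")
pipeline.close()                    # flushes the final partial line
```

The point of the sketch is that first_seen is already non-empty while
data is still arriving; nothing waits for the whole object.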

3. Henrik Nielsen is working here on libwww and making it multithreaded,
so that clients can do other things with other windows and other
requests while a document is being processed. This would obviously be
useful if one is to be able to work while watching the football.
This is for the release after next of libwww, but much of the
groundwork is already in.

> With an MPEG
> movie, you cannot watch it until the entire movie is on your local
> machine. We feel that this is bad for a variety of reasons.

You bet.
> First,
> useful work can be done with partial data. Second, you run the risk
> of not being able to fit a very large object in your storage
> hierarchy.

Sure. Of course the counter-argument is that you may not be
able to get the MPEG fast enough to play on the fly. Often people
can't even get sound fast enough, and have to wait a minute for a
10 sec sound bite. So one needs a choice depending on the
storage vs. bandwidth availability.
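
To make the storage vs. bandwidth trade-off concrete, here is a small
sketch. The function names, the decision rule, and the figures (an
80,000-byte clip, i.e. 10 sec of 8 kHz 8-bit mono audio, over a
9600 bit/s link) are assumptions chosen for illustration, not
measurements from this thread.

```python
def required_bps(size_bytes, duration_s):
    """Average bit rate needed to play the object as it arrives."""
    return size_bytes * 8 / duration_s


def download_seconds(size_bytes, bandwidth_bps):
    """Time to fetch the whole object before playing it."""
    return size_bytes * 8 / bandwidth_bps


def choose_delivery(size_bytes, duration_s, bandwidth_bps, free_storage_bytes):
    """Return 'stream', 'download', or 'neither' (assumed, simplistic policy)."""
    if bandwidth_bps >= required_bps(size_bytes, duration_s):
        return "stream"      # the link keeps up: play on the fly
    if size_bytes <= free_storage_bytes:
        return "download"    # too slow to stream, but it fits locally
    return "neither"         # too slow and too big to store


# Assumed example: 10 sec clip, 80,000 bytes, 9600 bit/s link, 1 MB free.
mode = choose_delivery(80_000, 10, 9_600, 1_000_000)
wait = download_seconds(80_000, 9_600)
```

With these assumed numbers the clip needs 64,000 bit/s to play live, so
the sketch falls back to downloading, and the wait comes out at a little
over a minute.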

> Clearly this last point is somewhat forward looking;
> however we already see several megabyte mpeg movies available.

Oh, there are plenty of movies. It is the bandwidth we lack.
But in certain cases you can guarantee the bandwidth, so
I'd like to see this option plugged in now.

> My question: is the "synchronous" nature of Mosaic because HTTP is
> meant to be implemented this way

Nope

> or just a consequence of the way this
> web browser was implemented?

Yup

> On page 2 of the draft HTTP
> specification, you see the four steps of a transaction, e.g.
> connection/request/response/close. There appears to be no reason that
> the browser cannot begin to do useful work during the response step.
> Has an intent been clearly specified for what is supposed to happen or
> is it open to interpretation?

Open. In fact, Phil Hallam-Baker here has already made an IRC
protocol module which gives you a text document which is
a continuously incrementing record of an IRC chat session.
There are lots of other cases in which you would want to
fork off a stream, like a group edit session on a document, etc.
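
Forking off a stream can be pictured as a "tee" stage that pushes each
incoming chunk to several consumers as it arrives. This is only a
sketch; Tee and ListSink are hypothetical names, not part of libwww or
of the IRC module mentioned above.

```python
class ListSink:
    """Hypothetical consumer that records whatever is pushed into it."""

    def __init__(self):
        self.chunks = []
        self.closed = False

    def put(self, data):
        self.chunks.append(data)

    def close(self):
        self.closed = True


class Tee:
    """Push-style stage that duplicates each chunk to every target."""

    def __init__(self, *targets):
        self.targets = targets

    def put(self, data):
        for target in self.targets:
            target.put(data)

    def close(self):
        for target in self.targets:
            target.close()


display = ListSink()   # e.g. the window showing the session
log = ListSink()       # e.g. a saved transcript
tee = Tee(display, log)

# Chunks of an open-ended session; each is visible to both consumers
# as soon as it is pushed, long before the connection closes.
for chunk in (b"<alice> hello\n", b"<bob> hi\n"):
    tee.put(chunk)
tee.close()
```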

> Obviously, we advocate that work begin during the response step and a
> pipelined method of using data be used.

libwww is totally pipeline oriented.

> With respect to our work, we have two approaches that we will be
> investigating:
>
> 1) As soon as data begins arriving, an external viewer can be spawned
> to use the data. This would require modifying Mosaic itself.
> (essentially what is described above)

That is no problem at all. We'll make it a configuration
option so it can be programmed in or specified in a
client config file. It means putting a popen() into the
code, which is a security hole of course, but that is inherent
in the launching of an external viewer.
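
The idea of launching a viewer as soon as data starts arriving and
feeding it through a pipe might look roughly like this. It is a sketch
in Python using subprocess.Popen, not the C popen() call itself; the
function name pipe_to_viewer and the stand-in "viewer" (a child Python
process that just counts bytes) are assumptions so the example can run
anywhere.

```python
import subprocess
import sys


def pipe_to_viewer(viewer_argv, chunks):
    """Start the viewer at once and feed it each chunk as it arrives."""
    # An argument list (no shell) avoids shell injection, but running any
    # configured external program remains a trust decision.
    proc = subprocess.Popen(viewer_argv, stdin=subprocess.PIPE)
    try:
        for chunk in chunks:
            proc.stdin.write(chunk)
            proc.stdin.flush()      # hand data over without waiting for the end
    except BrokenPipeError:
        pass                        # viewer exited early
    finally:
        try:
            proc.stdin.close()      # end of data: lets the viewer finish
        except BrokenPipeError:
            pass
    return proc.wait()


# Stand-in viewer (assumption): exits with 0 only if it received 6 bytes.
viewer = [
    sys.executable,
    "-c",
    "import sys; d = sys.stdin.buffer.read(); sys.exit(0 if len(d) == 6 else 1)",
]
status = pipe_to_viewer(viewer, [b"abc", b"def"])
```

In a real client the chunks would come from the network read loop, and
the viewer command would come from the configuration option described
above.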

> 2) Have the web server spawn a process which deals with the
> communication and sending of the information to a similar process on
> the client machine.

You would maybe want to do this if you want to run a protocol
other than TCP/IP (maybe RTP or something) for the connection,
or if your operating system won't let the server or client
hand over the existing TCP/IP connection to the new process.

> We welcome all comments on this work and suggestions,

We welcome the work! Please tie in with us so that we can
make sure we put in any hooks you need.

Tim Berners-Lee
WWW, CERN