Re: Image types and related issues [was: Re: filetype extensions]

Daniel W. Connolly (connolly@hal.com)
Tue, 10 May 1994 14:09:28 -0500


In message <94051019265103@cguv5.cgu.mcc.ac.uk>, Chris Lilley, Computer Graphics Unit writes:
>"Daniel W. Connolly" <connolly@hal.com> said:
>
>> "Assume, for the sake of argument, that this caching server implements
>> 100% invertible translation between gif and tiff."
>
>I am willing to pretend that you said
>
> "Assume, for the sake of argument, that this caching server implements
> 100% invertible translation between format A and format B."
>
>The particular example you cited (tiff to gif) had enough problems; the reverse
>process is guaranteed not to produce the same image. The information loss going
>from a 24 bit TIFF to an 8 bit (at best) GIF is in no way reversible.

Yeah... what he said. I get into more trouble by hurrying...

>> The Accept: header already specifies things like how much it costs the client
>> to deal with the given format, and tolerance on how long it is willing to wait
>> for a conversion.
>
>It does? You mean, in theory, or do actual clients and servers generate and use
>this information?

There's code in WWWLibrary2.15 to support this. I infer from that evidence
that the CERN httpd server implements all this foo. Clients? Dunno...

>> We just add one that says "my tolerance for information loss
>> is 1.0, i.e. no information loss is tolerable." For help icons and such, you
>> would set t=0.9 or so.
>
>OK, but you need to specify what exactly the different levels of quality mean.
>
>1.0 is clear enough ;-) and so is 0.0 - convert it any old way but give me some
>sort of image.
>
>The meaning of the intermediate values needs to be defined. How does 0.7 differ
>from 0.4, exactly?

Let's see... the way I see it, only 0.0 and 1.0 have "universal"
meaning -- the other numbers can only be compared with other numbers
in the same request. For example, you might write:

Accept: image/jpeg; t=.9, image/gif; t=.8

to mean "If you've got lossy conversion to JPEG or to GIF, I'll take
JPEG, cuz I suspect that a lossy conversion to GIF involved dithering
or something ridiculous like that, but JPEG compression is usually
pretty invisible to the eye."

Hmm... the server perhaps should multiply .9 by some estimated data loss
factor for the ->JPEG conversion, and .8 by the ->GIF conversion factor,
and compare the products. Anyway... the .8 and .9 shouldn't be used
outside this context.
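
Roughly like this, as a Python sketch. Everything here is an assumption
for illustration: t is the parameter being made up in this thread, not
something the Accept: header is known to define, and the fidelity
numbers (1.0 = nothing lost) are invented.

```python
def parse_accept(header):
    """Parse 'type; t=x, type; t=y' into {type: t}.

    The t parameter is the hypothetical one from this thread;
    a type without it is treated as t=1.0.
    """
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0]
        t = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "t":
                t = float(value)
        prefs[media_type] = t
    return prefs


def choose_format(header, fidelity):
    """Pick the type with the largest (client t) * (server fidelity)."""
    prefs = parse_accept(header)
    scored = {
        media_type: t * fidelity[media_type]
        for media_type, t in prefs.items()
        if media_type in fidelity  # server can only send what it can make
    }
    if not scored:
        return None
    return max(scored, key=scored.get)


# Made-up server-side estimates of how much survives each conversion.
FIDELITY = {"image/jpeg": 0.95, "image/gif": 0.6}
best = choose_format("image/jpeg; t=.9, image/gif; t=.8", FIDELITY)
```

The numbers only matter relative to each other within the one request,
which is the point: the server compares products and nothing else.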

>A further point; I assumed 1.0 to mean the exact same file as on the server.

Nope. (Of course I'm making this up as I go, but...) If you want the
same file, you gotta know the format it's stored in and ask for
that. Or ask for it by MD-5 signature, or something like that.

>What if, however, you ask the original server for
>
>Accept: image/x-iris-rgb; t=1.0
>
>(Iris RGB is a lossless 24 bit image format BTW) and the server has TIFF
>available? It can do a conversion, and it can guarantee (in most cases) that the
>RGB value of each pixel is identical. But it is not the same file.

t=1.0 means "the same information", and clearly this is such a case.
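
To make the "same information, different file" distinction concrete,
here is a toy Python sketch. The two byte layouts are invented stand-ins
for real formats (they are not TIFF or Iris RGB), and MD5 is used only
because it was mentioned above as a way to name an exact file.

```python
import hashlib

# A 2x2 toy image as rows of (r, g, b) tuples.
PIXELS = [[(255, 0, 0), (0, 255, 0)],
          [(0, 0, 255), (255, 255, 255)]]


def encode_interleaved(pixels):
    """Toy 'format A': width, height, then RGBRGB... bytes."""
    h, w = len(pixels), len(pixels[0])
    body = bytes(c for row in pixels for px in row for c in px)
    return bytes([w, h]) + body


def encode_planar(pixels):
    """Toy 'format B': width, height, then all R, all G, all B."""
    h, w = len(pixels), len(pixels[0])
    flat = [px for row in pixels for px in row]
    body = bytes(px[ch] for ch in range(3) for px in flat)
    return bytes([w, h]) + body


def decode_interleaved(data):
    w, h, body = data[0], data[1], data[2:]
    return [[tuple(body[3 * (y * w + x): 3 * (y * w + x) + 3])
             for x in range(w)] for y in range(h)]


def decode_planar(data):
    w, h, body = data[0], data[1], data[2:]
    n = w * h
    return [[(body[y * w + x], body[n + y * w + x], body[2 * n + y * w + x])
             for x in range(w)] for y in range(h)]


file_a = encode_interleaved(PIXELS)
file_b = encode_planar(PIXELS)

# Byte-level identity: what asking by digest would test.
same_file = hashlib.md5(file_a).digest() == hashlib.md5(file_b).digest()
# Pixel-level identity: what t=1.0 is meant to promise.
same_information = decode_interleaved(file_a) == decode_planar(file_b)
```

The digests differ while the decoded pixels match, so a t=1.0 request
can be satisfied by a converted file.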

>And another thing - suppose a server is configured to convert the TIFF to an
>Iris RGB, maybe cache it for a day, then delete it to prevent wasting disk space
>as multiple formats of the same image build up. A month later, I use the same
>URL and get the same conversion done. Fine. Now put a proxy in the way; it
>happens to be caching last month's Iris RGB file. I ask it for that file, and
>it asks the original server for the last-modified and expires fields for
>foo.rgb.
>
>What should the server respond? The values for the original TIFF? A
>last-modified of NOW as it is building the file transparently, on the fly, as we
>speak?

For the original TIFF... and yes, this gets hairy. But we want it done
right, no?
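
A rough Python sketch of what "for the original TIFF" could mean in
practice. The function names are hypothetical, and this is one possible
policy, not what any existing server does: every converted variant
reports the source file's modification time, so a proxy's old copy stays
valid until the source itself changes.

```python
from email.utils import formatdate, parsedate_to_datetime


def last_modified_for(source_mtime):
    """Last-Modified value for any representation derived from the source.

    Uses the source's mtime (seconds since the epoch), not the time the
    conversion happened to run.
    """
    return formatdate(source_mtime, usegmt=True)


def cached_copy_is_current(cached_last_modified, source_mtime):
    """True if a proxy's copy is at least as new as the source file."""
    cached = parsedate_to_datetime(cached_last_modified).timestamp()
    return cached >= int(source_mtime)


tiff_mtime = 770000000                    # assumed mtime of the TIFF
header = last_modified_for(tiff_mtime)    # sent with the on-the-fly RGB
still_good = cached_copy_is_current(header, tiff_mtime)
stale = not cached_copy_is_current(header, tiff_mtime + 86400)
```

With this policy, regenerating foo.rgb a month later does not invalidate
the proxy's copy; only touching the TIFF does.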

>To sum up, I am saying that there is a complex interaction between a)
>transparent format negotiation and conversion, and b) cache coherency issues
>arising from proxy caching. These raise a whole host of issues that urgently
>need an interim solution and, in the long term, need to be elegantly sorted out
>and documented.

Well said.

>The issues do not seem to have been raised before till I started messing with
>them, but then I don't work in a computer graphics unit for nothing ;-)

You're not the first to worry...

From "A Formalism for Internet Information References", aka
http://www.hal.com/%7Econnolly/drafts/formalism.html
$Id: formalism.html,v 1.1 1994/04/25 17:48:29 connolly Exp $

The technology to support the growing base of internet information
resources will only get more sophisticated as we attack the problem
of large scale data reduction (resource discovery and navigation) and
as we employ the machine more and more to augment learning and the
maintenance of information. Formal techniques are necessary to reduce
the complexity of such technology.

><Side_issue>
>This cultural heritage also shows in the mime types. image/tiff conveys nothing,
></Side_issue>

You're darn right it conveys nothing -- that's why it's not a standard
MIME content type! The MIME guys do their homework. Don't bash them
until you do yours :-)

Dan