Enhancing mailcaps for MIME and WWW

Marc VanHeyningen (mvanheyn@cs.indiana.edu)
Thu, 19 May 1994 17:38:07 -0500


Mailcaps have a great deal of promise as a mechanism for centralizing
configuration information about handling MIME content-types,
permitting a variety of programs to successfully handle new items
with a relative minimum of fuss. It is assumed that most programs
will have a (reasonably small) set of content-types they can handle
directly and will use this file to resolve others.

Of particular importance to me is the use of MIME in the WWW
community. It is desirable for the mail and WWW communities to build
good things together rather than each doing different and
incompatible things, as has already happened in a few cases.

I believe both the mail-based MIME and WWW communities stand to
benefit from a unified content-type information file which is
somewhat, but not radically, richer than the present configuration.
I'd like to find some other people who are interested in doing this
and hammer out some issues. I am looking more for people who want to
extend the idea towards this kind of functionality than for specific
criticisms of the half-baked ideas that follow, though those are
fine too.

Some ideas include:

QUALITY VALUES

HTTP includes the concept of quality values in content-type
negotiation so that, in principle, servers can be aware of the
capabilities of the client's hardware and software and of the user's
preferences.
MIME attempts something similar through the multipart/alternative
structure, but the preference is strictly set by how the sender
chooses to order the body parts without regard for the specifics of
the viewing environment.

Adding something like an optional "quality" field to a mailcap entry
could provide this sort of information. For example, consider the
following made-up mailcap excerpt:

text/plain; cat; copiousoutput; quality=1.0
application/postscript; ghostview %s; needsx11; quality=2.0
application/postscript; ps2ascii %s; copiousoutput; quality=0.1

Clearly, whether PostScript is to be preferred over plain text under
the above configuration depends on whether X11 is currently running.
More generally, which content-types are preferred is not static.
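
To make the dependence on the environment concrete, here is a rough
sketch in Python. It assumes the made-up "quality" and "needsx11"
fields from the excerpt above, treats a missing quality as 1.0 (my
assumption), and ignores mailcap details such as escaped semicolons
and continuation lines. The function names are hypothetical.

```python
def parse_entry(line):
    """Split one mailcap line into (content_type, command, fields)."""
    parts = [p.strip() for p in line.split(";")]
    ctype, command = parts[0].lower(), parts[1]
    fields = {}
    for part in parts[2:]:
        name, _, value = part.partition("=")
        # Flag-style fields such as "copiousoutput" carry no value.
        fields[name.strip().lower()] = value.strip() or True
    return ctype, command, fields


def effective_quality(lines, have_x11):
    """Return {content_type: best usable quality} for this environment."""
    best = {}
    for line in lines:
        ctype, _command, fields = parse_entry(line)
        if fields.get("needsx11") and not have_x11:
            continue  # this entry is unusable without a display
        quality = float(fields.get("quality", 1.0))  # default is assumed
        best[ctype] = max(best.get(ctype, 0.0), quality)
    return best


MAILCAP = [
    "text/plain; cat; copiousoutput; quality=1.0",
    "application/postscript; ghostview %s; needsx11; quality=2.0",
    "application/postscript; ps2ascii %s; copiousoutput; quality=0.1",
]

print(effective_quality(MAILCAP, have_x11=True))
# {'text/plain': 1.0, 'application/postscript': 2.0}
print(effective_quality(MAILCAP, have_x11=False))
# {'text/plain': 1.0, 'application/postscript': 0.1}
```

With X11 running PostScript outranks plain text; without it the
ps2ascii entry is the only usable one and plain text wins.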

Obviously, a WWW client could use the quality information in its
requests to reflect what form of information transfer is to be
preferred. A normal MIME reader, however, could also use this
information to choose which portion of a multipart/alternative
structure to present, possibly overriding the default of presenting
the last body part it can display. This permits the specification of crude,
low-quality representations for complex objects for use in
impoverished environments without causing them to be preferred over
formats which can be displayed directly without loss. MIME composing
agents might also use this information (ignoring the test conditions,
of course) to order alternative body parts automatically.
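
As a sketch of both uses, the Python below takes a table of
per-type quality values (however it was derived from the mailcap) and
either picks a part of a multipart/alternative or renders an Accept
header. The function names are hypothetical, and scaling the values
so the largest becomes q=1 is my assumption, made because HTTP q
values are expected to lie between 0 and 1.

```python
def choose_alternative(part_types, quality):
    """Pick the index of the part to show from a multipart/alternative.

    part_types lists the content-types in the sender's order. Ties go
    to the later part, matching the usual rule of preferring the last
    displayable alternative. Returns None if nothing is displayable.
    """
    best_index, best_q = None, 0.0
    for index, ctype in enumerate(part_types):
        q = quality.get(ctype.lower(), 0.0)
        if q > 0.0 and q >= best_q:
            best_index, best_q = index, q
    return best_index


def accept_header(quality):
    """Render the table as an HTTP Accept header, scaled into 0..1."""
    usable = {t: q for t, q in quality.items() if q > 0}
    if not usable:
        return ""
    top = max(usable.values())
    items = sorted(usable.items(), key=lambda kv: -kv[1])
    return "Accept: " + ", ".join(
        f"{ctype}; q={q / top:.2f}" for ctype, q in items
    )


with_x11 = {"text/plain": 1.0, "application/postscript": 2.0}
without_x11 = {"text/plain": 1.0, "application/postscript": 0.1}
parts = ["text/plain", "application/postscript"]

print(choose_alternative(parts, with_x11))     # 1 (the PostScript part)
print(choose_alternative(parts, without_x11))  # 0 (the plain text part)
print(accept_header(with_x11))
# Accept: application/postscript; q=1.00, text/plain; q=0.50
```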

FILE RECOGNITION

In general, when composing MIME entities, few people (or
non-interactive programs) sit down and decide "I think I'll insert an
image/jpeg and an application/postscript." People are more likely to
think "I'll include this file I was working on which is named foo."

I have experimented with using the "nametemplate" field as a heuristic
for matching arbitrary filenames to content-types; I believe it
preferable to maintaining a separate configuration file mapping them
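together. However, that is not what nametemplate was intended for; it
exists so that a viewer which insists on a particular file extension
can be handed a suitably named temporary file.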

It would be a big win to be able to define some (obviously
OS-dependent) mechanisms for specifying file attributes typical for
various content-types; all software that composes MIME entities, from
mail programs to HTTP servers, might use this.

Consider the following made-up-off-the-top-of-my-head approach:

application/postscript; blah; description=a PostScript(tm) file; \
contenttest=ispostscript %s

where the program "ispostscript" takes the filename of the prospective
body and exits with a zero status if that body is of type
application/postscript. The test program might just look at the
filename, might check for magic numbers in the file, or whatever.
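
A composing agent might run such tests roughly as follows. This is
only a sketch in Python: "contenttest" is the made-up field from
above, the function names are hypothetical, and treating a missing
test program as a failed test is my assumption.

```python
import shlex
import subprocess


def content_test_passes(template, path):
    """Run a contenttest command; True means it exited with status zero."""
    # Split the template first and substitute afterwards, so that odd
    # characters in the filename are never interpreted by a shell.
    argv = [word.replace("%s", path) for word in shlex.split(template)]
    try:
        return subprocess.run(argv).returncode == 0
    except OSError:
        return False  # test program missing or not executable


def guess_content_type(path, contenttests):
    """Return the first content-type whose test accepts the file."""
    for ctype, template in contenttests:
        if content_test_passes(template, path):
            return ctype
    return None
```

It would be called with pairs taken from the mailcap, for example
guess_content_type("foo", [("application/postscript",
"ispostscript %s")]).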

Here's another approach (actually two of them combined):

text/plain; blah; matchfiles=*.txt
application/postscript; blah; description=a PostScript(tm) file; \
matchfiles=*.ps; magic=%!
image/gif; blah; description=a GIF file; matchfiles=*.gif; magic=GIF
audio/basic; blah; description=an 8-bit u-law mono 8000 Hz audio fragment; \
matchfiles=*.snd,*.au; magic=.snd#000000010000000100001E40#

where a pair of # characters delimits hexadecimal data. Or something
like that.
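
Here is a sketch in Python of how a composing agent could apply such
entries. The "matchfiles" and "magic" fields are the made-up ones
above, the rules are written out as tuples rather than parsed from a
mailcap, and checking magic before filename patterns is my
assumption about how the two would combine.

```python
import fnmatch
import os


def parse_magic(spec):
    """Turn a magic spec into bytes; text between # pairs is hex."""
    out = b""
    for i, chunk in enumerate(spec.split("#")):
        # Odd-numbered chunks sit between a pair of # delimiters.
        out += bytes.fromhex(chunk) if i % 2 else chunk.encode("latin-1")
    return out


RULES = [
    # (content_type, matchfiles patterns, magic spec or None)
    ("text/plain", "*.txt", None),
    ("application/postscript", "*.ps", "%!"),
    ("image/gif", "*.gif", "GIF"),
    ("audio/basic", "*.snd,*.au", ".snd#000000010000000100001E40#"),
]


def guess_content_type(path, rules=RULES):
    """Guess a content-type from leading bytes, then from the name."""
    with open(path, "rb") as f:
        head = f.read(64)
    name = os.path.basename(path)
    for ctype, _patterns, magic in rules:
        if magic and head.startswith(parse_magic(magic)):
            return ctype
    for ctype, patterns, _magic in rules:
        if any(fnmatch.fnmatch(name, p) for p in patterns.split(",")):
            return ctype
    return None
```

A file named "foo" that begins with the bytes "GIF" would come back
as image/gif, while "notes.txt" with no recognizable magic would fall
through to the filename patterns and come back as text/plain.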

Along those general lines is the idea of providing additional
information to composing agents about how to generate each
content-type. For instance, a field indicating which parameters are
mandatory for a content-type could be useful.

--
<A HREF="http://www.cs.indiana.edu/hyplan/mvanheyn.html">Marc VanHeyningen</A>