revised proposal for file upload from browser to server

Larry Masinter (masinter@parc.xerox.com)
Tue, 1 Nov 1994 17:12:05 PST


This is a (substantial) revision of Ernesto Nebel's original proposal
for 'file upload'; it follows discussions at the File Upload BOF at
the Chicago WWW conference and conversations with a large number of
individuals, including Jay Weber and Alan Schiffman at EIT, Ned Freed,
and Keith Ball.

FILE TRANSMISSION FROM WORLD WIDE WEB BROWSERS TO SERVERS

I. Introduction
---------------

Currently, a World Wide Web server can get information from users with
HTML forms. These forms have proven useful in a wide variety of
applications in which input from the user is necessary. But this
capability is still greatly limited because HTML forms don't provide a
way for the user to submit files to the server. Service providers who
need to get files from the user have had to implement custom browsers.
(Examples of these custom browsers have appeared on the www-talk
mailing list.) To avoid the necessity for custom browsers and to make
WWW servers complete in their ability to get information from the
user, the WWW needs to provide a way for users to send files to
servers. Since user information is sent back to the server using HTML
forms, it is most logical to extend HTML forms to support file
submission.

II. HTML forms with file submission
-----------------------------------

The current draft HTML specification
<URL:http://www.hal.com/%7Fconnolly/html-spec/spyglass-19941014/html-19941014.txt.Z>
defines eight possible values for the attribute TYPE of an INPUT
element: CHECKBOX, HIDDEN, IMAGE, PASSWORD, RADIO, RESET, SUBMIT, TEXT.

In addition, it defines the ENCTYPE attribute of the FORM element to
have the default value "application/x-www-form-urlencoded".

This proposal makes two changes:

1) add a FILE option for the TYPE attribute of INPUT
2) allow the ENCTYPE of a FORM to be "multipart/www-form"

(These changes might be considered independently, but are both
necessary for reasonable file upload).

The author of an HTML form who wants to request one or more files from
a user would write (for example):

<FORM ENCTYPE="multipart/www-form" ACTION="...url..." METHOD=POST>

File to process: <INPUT NAME="userfile1" TYPE="file">

</FORM>

The change to the HTML DTD is trivial: just one item added to the
entity "InputType", as follows:

... (other elements) ...

<!ENTITY % InputType "(TEXT | PASSWORD | CHECKBOX |
                       RADIO | SUBMIT | RESET |
                       IMAGE | HIDDEN | FILE )">
<!ELEMENT INPUT - O EMPTY>
<!ATTLIST INPUT
        TYPE %InputType TEXT
        NAME CDATA #IMPLIED  -- required for all but submit and reset --
        VALUE CDATA #IMPLIED
        SRC %URI #IMPLIED  -- for image inputs --
        CHECKED (CHECKED) #IMPLIED
        SIZE CDATA #IMPLIED  -- like NUMBERS,
                                but delimited with comma, not space --
        MAXLENGTH NUMBER #IMPLIED
        ALIGN (top|middle|bottom) #IMPLIED
        >

... (other elements) ...

This is the minimal change requested. Other, larger changes to the
InputType entity might also be contemplated but are not part of this
proposal. For example, an INPUT element might usefully have an
attribute which identifies a set of acceptable media-types, e.g.,
<INPUT TYPE=FILE ACCEPT="image/gif, image/tiff" NAME="image1">.

III. Proposed implementation
----------------------------

The proposed implementation in WWW browsers is, when an INPUT tag of
type FILE is encountered, to show a display of (previously selected)
file names, and a "Browse" button or selection method. Selecting the
"Browse" button would cause the browser to enter a file selection
mode appropriate for the platform. Window-based browsers might pop up
a file selection window, for example. In such a file selection dialog,
the user would have the option of replacing a current selection,
adding a new file selection, etc. Browser implementors might choose
to let the list of file names be manually edited.

When the user completes the form and selects the SUBMIT element, the
browser should send the form data and the content of the selected
files. The encoding type "application/x-www-form-urlencoded" is
inefficient for sending large quantities of binary data. Thus, a (new)
media type, "multipart/www-form", is proposed as a way of efficiently
sending the values associated with a filled-out form from client to
server.

The media-type (MIME-type) "multipart/www-form-data" follows the rules
of all multipart MIME data streams as outlined in RFC 1521: a boundary
is selected that does not occur (with more than infinitesimal
probability) in any of the data. Each field of the form is sent, in
the order in which it occurs in the form, as a part of the multipart
stream. Each part identifies the INPUT name within the original HTML
form using a "Name: " attribute. Each part has an optional
Content-Type (which defaults to text/plain). File inputs should be
identified as either application/binary or the appropriate media type,
if known. If multiple files were selected, they should be transferred
together using the multipart/mixed format. The
"content-transfer-encoding" for each part should be "binary". File
inputs may optionally identify the file name using the
"Content-Description" header. Browsers may optionally include a
Content-Length header both in the reply and in individual components;
the content-length is not intended as a replacement for the boundary
but just as a way of forewarning the server of the amount of data coming.
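
As an illustration only, the encoding rules above might be sketched in
Python as follows. The function names, the tuple layout of `fields`,
and the use of a random hex string as the boundary are assumptions of
this sketch, not part of the proposal; nesting of several files in
multipart/mixed and the optional Content-Length header are left out.

```python
import uuid

CRLF = b"\r\n"


def choose_boundary(payloads):
    """Pick a random boundary, retrying until it occurs in none of the payloads."""
    while True:
        candidate = uuid.uuid4().hex.encode("ascii")
        if not any(candidate in payload for payload in payloads):
            return candidate


def encode_form(fields):
    """Encode form fields as a multipart body.

    fields is a list of (name, data, content_type, filename) tuples
    (a layout assumed for this sketch). data is bytes; content_type and
    filename may be None. Returns (boundary, body).
    """
    boundary = choose_boundary([data for _, data, _, _ in fields])
    chunks = []
    for name, data, content_type, filename in fields:
        # Each part names the INPUT it came from with a "Name:" header.
        headers = [b"Name: " + name.encode("ascii")]
        if content_type is not None:
            # Omitted Content-Type means text/plain on the receiving side.
            headers.append(b"Content-Type: " + content_type.encode("ascii"))
        if filename is not None:
            headers.append(b"Content-Description: " + filename.encode("ascii"))
        headers.append(b"Content-Transfer-Encoding: binary")
        chunks.append(
            b"--" + boundary + CRLF + CRLF.join(headers) + CRLF + CRLF + data + CRLF
        )
    chunks.append(b"--" + boundary + b"--" + CRLF)
    return boundary, b"".join(chunks)
```

The fields are emitted in list order, so the caller is expected to
pass them in the order in which they occur in the form.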

On the server end, the ACTION might point to an HTTP URL that
implements the form's action via CGI. In such a case, the CGI program
would note that the content-type is multipart/www-form-data, parse the
various fields (checking for validity, writing the file data to local
files for subsequent processing, etc.).
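
A minimal sketch of the parsing step, assuming the body and its
boundary have already been read from the request; `parse_form` is a
hypothetical name, and validity checking and writing file data to
local files are left to the caller.

```python
def parse_form(body, boundary):
    """Split a multipart body into a list of (headers, data) pairs.

    body and boundary are bytes. Header names are lower-cased. The
    sender is trusted to have chosen a boundary that does not occur in
    any of the data, as the multipart rules require.
    """
    delimiter = b"--" + boundary
    parts = []
    # Anything before the first delimiter is preamble and is ignored.
    for section in body.split(delimiter)[1:]:
        if section.startswith(b"--"):
            break  # closing delimiter "--boundary--"
        head, _, data = section.partition(b"\r\n\r\n")
        if data.endswith(b"\r\n"):
            # The CRLF preceding the next delimiter is not part of the data.
            data = data[:-2]
        headers = {}
        for line in head.split(b"\r\n"):
            if not line:
                continue
            key, _, value = line.partition(b":")
            headers[key.strip().lower().decode("ascii")] = value.strip().decode("ascii")
        parts.append((headers, data))
    return parts
```

A CGI program could then look up `headers["name"]` for each part and
treat a missing "content-type" entry as text/plain.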

IV. Backward compatibility issues
---------------------------------

While not necessary for successful adoption of an enhancement to the
current WWW form mechanism, it is useful to also plan for a migration
strategy: users with older browsers can still participate in file
upload dialogs, using a 'helper' application. Most current browsers
that we have investigated, when given <INPUT TYPE=FILE>, will treat it
as <INPUT TYPE=TEXT> and give the user a text box. The user can type
a file name into this text box. In addition, current browsers seem
to ignore the ENCTYPE parameter in the <FORM> element, and always
transmit the data as application/x-www-form-urlencoded.

Thus, the server CGI might be written to note that the form data
returned had content-type application/x-www-form-urlencoded instead
of multipart/www-form-data, and so know that the user was using a
browser that didn't implement file upload.
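
The check itself could be as small as the following sketch.
`classify_request` and its return values are hypothetical; the
URL-encoded type is spelled as in section II
(application/x-www-form-urlencoded), and the new type is written as
multipart/www-form-data, one of the two spellings used in this
proposal.

```python
def classify_request(content_type):
    """Classify a request by the media type of its body.

    Returns "upload" for the new multipart type, "legacy" for the
    URL-encoded type sent by older browsers, and "unknown" otherwise.
    """
    # Drop parameters such as "; boundary=..." and ignore case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "multipart/www-form-data":
        return "upload"
    if media_type == "application/x-www-form-urlencoded":
        return "legacy"
    return "unknown"
```

In a CGI program the argument would typically come from the
CONTENT_TYPE environment variable.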

In this case, rather than replying with a "text/html" response, the
CGI on the server could instead send back something that a 'helper'
application might process instead.

The CGI would take the URL-encoded form data it received, identify
which of the fields should have their values replaced with file
contents, and send the entire form back to the client, identified
with a *new* MIME type: "application/x-please-send-files". The data in
application/x-please-send-files contains all of the original form data
supplied, and the URL to which the multipart/www-form-data should
actually be sent.

The simplest design for application/x-please-send-files would be:

1) the URL to which the data should actually be sent (one line)
2) the names of the fields whose values should be replaced with files
   (space separated, on one line)
3) all of the form data, as originally sent to the server
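
The three-part layout just described could be produced and read back
as in the sketch below. The function names and the choice of a bare
newline as line terminator are assumptions made for illustration; the
proposal does not fix either.

```python
def build_please_send_files(action_url, file_fields, form_data):
    """Build an application/x-please-send-files body.

    Line 1 is the URL, line 2 the space-separated names of the fields
    to be replaced with files, and the rest is the form data exactly
    as the server originally received it.
    """
    return "\n".join([action_url, " ".join(file_fields), form_data])


def parse_please_send_files(text):
    """Inverse of build_please_send_files: return (url, field_names, form_data)."""
    url, names, form_data = text.split("\n", 2)
    return url, names.split(), form_data
```

Because field names are space separated, this layout assumes that no
field name itself contains a space.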

The browser would need to be configured to process
application/x-please-send-files to launch a helper application.

The helper would read the form data, note which fields contained
'local file names' that needed to be replaced with their data content,
might itself prompt the user for changing or adding to the list of
files available, and then repackage the data & file contents in
multipart/www-form-data for retransmission back to the server.
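
The substitution step of such a helper might look like the following
sketch. `substitute_files`, `read_local_file`, the triple layout of
the result, and the UTF-8 encoding of ordinary values are all
assumptions of this example; a real helper would also confirm each
file with the user before reading it.

```python
from urllib.parse import parse_qsl


def read_local_file(path):
    """Return the contents of a local file as bytes."""
    with open(path, "rb") as handle:
        return handle.read()


def substitute_files(form_data, file_fields, read_file=read_local_file):
    """Replace local file names in URL-encoded form data with file contents.

    Returns a list of (name, data, filename) triples in the original
    field order; filename is None for fields that were not file inputs.
    """
    result = []
    for name, value in parse_qsl(form_data, keep_blank_values=True):
        if name in file_fields:
            # The value typed into the text box is a local file name.
            result.append((name, read_file(value), value))
        else:
            result.append((name, value.encode("utf-8"), None))
    return result
```

The `read_file` parameter exists only so the reading step can be
replaced, for example by a function that first asks the user for
permission.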

The helper would generate the kind of data that a 'new' browser should
actually have sent in the first place, with the intention that the URL
to which it is sent corresponds to the original ACTION URL. The point
of this is that the server can use the *same* CGI to implement the
mechanism for dealing with both old and new browsers.

The helper need not display the form data, but *should* ensure that
the user actually be prompted about the suitability of sending the
files requested. (This is to avoid a security problem with malicious
servers that ask for files that weren't actually promised by the
user.) It would be useful if the status of the transfer of the files
involved could be displayed.

V. Other considerations
------------------------

Compression:

This scheme doesn't address the possible compression of files. After
some consideration, it seemed that the optimization issues of file
compression were too complex to try to automatically have browsers
decide that files should be compressed. Many link-layer transport
mechanisms (e.g., high-speed modems) perform data compression over the
link, and optimizing for compression at this layer might not be
appropriate. It might be possible for browsers to optionally produce
a content-transfer-encoding of x-compress for file data, and for
servers to decompress the data before processing, if desired; this was
left out of the proposal, however.

Deferred file transmission:

In some situations, it might be advisable to have the server validate
various elements of the client's data (user name, account, etc.) before
actually preparing to receive the data. However, after some
consideration, it seemed best to require that servers that wish to do
this should implement this as a series of forms, where some of the
data elements that were previously validated might be sent back to the
client as 'hidden' fields. This puts the onus of maintaining the state
of a transaction only on those servers that wish to build a complex
application, while allowing those cases that have simple input needs
to be built simply.
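
As a sketch of the series-of-forms approach, a server that has already
validated some values could generate its second form like this.
`second_stage_form`, the field name "userfile1", and the ENCTYPE
spelling multipart/www-form-data are assumptions chosen for the
example.

```python
from html import escape


def second_stage_form(action_url, validated):
    """Return HTML for a follow-up form carrying validated values as hidden fields.

    validated maps field names to the string values the server has
    already checked; they are echoed back so the server need not keep
    any state between the two forms.
    """
    lines = [
        '<FORM ENCTYPE="multipart/www-form-data" METHOD=POST ACTION="%s">'
        % escape(action_url, quote=True)
    ]
    for name, value in validated.items():
        lines.append(
            '<INPUT TYPE="hidden" NAME="%s" VALUE="%s">'
            % (escape(name, quote=True), escape(value, quote=True))
        )
    lines.append('File to send: <INPUT NAME="userfile1" TYPE="file">')
    lines.append('<INPUT TYPE="submit">')
    lines.append("</FORM>")
    return "\n".join(lines)
```

Since hidden fields come back from the client, the server would still
need to re-check them when the second form is submitted.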

Other choices for return transmission of binary data:

Various people have suggested using a new MIME top-level type
"aggregate", e.g., aggregate/mixed or a content-transfer-encoding of
"packet" to express indeterminate-length binary data, rather than
relying on the multipart-style boundaries. While I'm not opposed to
doing so, this would require additional design and standardization
work to get acceptance of "aggregate". On the other hand, the
'multipart' mechanisms are well established, trivial to implement on
both the sending client and receiving server, and as efficient as
other methods of dealing with multiple combinations of binary data.

Not overloading <INPUT>:

Various people have wondered about the advisability of overloading
'INPUT' for this function, rather than merely providing a different
type of FORM element. Among other considerations, the
migration strategy which is allowed when using <INPUT> is important.
In addition, the <INPUT> field *is* already overloaded to contain most
kinds of data input; rather than creating multiple kinds of <INPUT>
tags, it seems most reasonable to enhance <INPUT>. The 'type' of INPUT
is not the content-type of what is returned, but rather the
'widget-type'; i.e., it identifies the interaction style with the
user. The description here is carefully written to allow <INPUT
TYPE=FILE> to work for text browsers or audio-markup.

VI. Conclusion
--------------

The suggested implementation gives the client a lot of flexibility in
the number and types of files it can send to the server; it gives the
server control of the decision to accept the files; and it gives
servers a chance to interact with browsers which do not support INPUT
TYPE "file".

The change to the HTML DTD is very simple, but very powerful. It
enables a much greater variety of services to be implemented via the
World Wide Web than is currently possible due to the lack of a file
submission facility. This would be an extremely valuable addition to
the capabilities of the World Wide Web.