URL syntax to return byte ranges from files

David Glazer (dglazer@best.com)
Tue, 14 Mar 1995 16:04:37 +0500


I'm working on a (commercial) project to allow random access into files
stored on the Web. We have it all working, using a simple CGI program that
takes a filename and byte range as arguments and returns the correct block
of data from the file. Before shipping, however, I'd like to get the
group's input on the syntax we choose, with an eye towards keeping it
cleanly extensible and interoperable.

Executive Summary:
Generalize support for WN's byte range syntax, ";bytes=<start>-<end>".
See http://hopf.math.nwu.edu/docs/range.html for more info.

All comments and feedback are welcome. In particular, if anyone knows of
any other already-deployed 'standards', or a more appropriate forum for
discussion, please say so (and/or forward this message).

Thanks for your time,
dG

David Glazer
dglazer@best.com

===========================================================================
Problem statement:
Define a URL syntax to return a specified byte range from a specified
file living on the server. Servers should be able to support the syntax
via either a built-in extension or a CGI script.

CGI tradeoffs:

Option 1: Use a CGI script
  Pros: simple to implement - doesn't require any server code changes
        provides the same syntax on all servers that support CGI
  Cons: requires an extra system call for each block requested
        requires the server to have CGI execution turned on
        requires a copy of the CGI script in each served directory
          (to avoid some scary security holes)
        only works with physical files located on the server's filesystem

Option 2: Build support into the server
  Pros: more efficient
        easier to administer
        leverages all server name-mapping and security features
  Cons: requires agreement on syntax to be portable across servers
        even after agreement, newly modified servers need to be deployed

Conclusion:
Both options make sense and should be supported, ideally with a single
URL syntax. Hopefully, the CGI version will become less and less
necessary as updated servers are rolled out.
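
To make Option 1 concrete, here is a rough sketch of the core of such a
CGI-style handler. It is not the program described above; the function
name, the bare-filename check, and the directory argument are all
assumptions made for illustration. Offsets are treated as inclusive, as
in the proposed syntax.

```python
import os


def read_byte_range(directory, filename, start, end):
    """Return bytes start..end (inclusive) of filename inside directory.

    Hypothetical helper: the name and checks are illustrative assumptions.
    """
    # Accept only a bare filename, so a request cannot climb out of the
    # served directory (one of the security holes a per-directory script
    # is meant to avoid).
    if filename in ("", ".", "..") or os.path.basename(filename) != filename:
        raise ValueError("filename must not contain path components")
    if start < 0 or end < start:
        raise ValueError("invalid byte range")
    with open(os.path.join(directory, filename), "rb") as handle:
        handle.seek(start)
        # Inclusive end offset, so the block length is end - start + 1.
        return handle.read(end - start + 1)
```

If the range runs past the end of the file, this sketch simply returns
fewer bytes; a real handler would have to decide how to report that.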

Proposed Syntax:
Given a base URL that identifies a document, append a modifier string to
select a range. The syntax of that string is ";bytes=<start>-<end>",
where <start> and <end> are inclusive byte offsets. The base URL can be
either the normal document URL (for servers with built-in support) or
a CGI URL.
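
As a sketch of how a server or script might recognize the modifier, the
following splits a URL into its base and the two offsets. It handles
only the single ";bytes=<start>-<end>" form described here; the helper
name and the error handling are assumptions, not part of WN or of our
implementation.

```python
import re

# Matches a trailing ";bytes=<start>-<end>" with decimal offsets.
_RANGE_RE = re.compile(r"^(?P<base>.*);bytes=(?P<start>\d+)-(?P<end>\d+)$")


def split_range_modifier(url):
    """Return (base_url, start, end); start and end are None if absent.

    Hypothetical helper, written for illustration only.
    """
    match = _RANGE_RE.match(url)
    if match is None:
        return url, None, None
    start = int(match.group("start"))
    end = int(match.group("end"))
    if end < start:
        raise ValueError("end offset precedes start offset")
    return match.group("base"), start, end
```

The same function works for both base URL styles, since the modifier is
always the trailing part of the string.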

Example:
The file foo.doc is available via URL
    http://www.a.com/docs/foo.doc
We want to get 512 bytes from foo.doc, starting at offset 1024.
If the server has built-in support, the URL would be
    http://www.a.com/docs/foo.doc;bytes=1024-1535
If using a CGI script (installed in foo.doc's directory), the URL would be
    http://www.a.com/docs/my.cgi?foo.doc;bytes=1024-1535
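
Because both offsets are inclusive, the ending offset is
start + length - 1 (here 1024 + 512 - 1 = 1535). A client-side sketch of
that arithmetic, with a hypothetical helper name:

```python
def range_modifier(offset, length):
    """Build the ";bytes=<start>-<end>" modifier for a block.

    Hypothetical helper; offset and length are in bytes.
    """
    if offset < 0 or length < 1:
        raise ValueError("offset must be >= 0 and length >= 1")
    # Inclusive end offset: the last byte of the block.
    end = offset + length - 1
    return f";bytes={offset}-{end}"


# The two URLs from the example above.
builtin_url = "http://www.a.com/docs/foo.doc" + range_modifier(1024, 512)
cgi_url = "http://www.a.com/docs/my.cgi?foo.doc" + range_modifier(1024, 512)
```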

Note on Syntax:
We considered several alternative syntaxes, such as:
    http://www.a.com/docs/my.cgi?foo.doc;bytes=1024+512
    http://www.a.com/docs/my.cgi?foo.doc+1024+1535
    http://www.a.com/docs/foo.doc?1024+512

They differ mainly in punctuation and in whether they supply a length
or an ending offset. I don't see any intrinsic benefit to most of the
choices, so two facts seem to carry the day: WN is already shipping with
a syntax, and that syntax can also be used cleanly in CGI scripts.
===========================================================================