Well, that's what Lou, Ari, and I are doing, setting the application level
proxy standard for the Web. We've defined how a client can speak HTTP to a
proxy server in order to interact (GET, POST, PUT...) with the Web without
losing any functionality on the client side. This is necessary for clients
behind firewalls, but is also useful when you want the proxy server to act
as a caching server for a site, minimizing Internet traffic to and from
that site. The client always speaks HTTP to the proxy server and results
are always returned via HTTP (actually in the same connection, usual
stuff). The proxy server in turn speaks HTTP, FTP, Gopher, WAIS, whatever,
in order to retrieve the actual data, but always returns the data as an
HTTP MIME message, doing MIME typing on the fly. The GET examples I posted
a few days ago were intended to show the client/server conversation.
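To make the shape of that conversation concrete, here is a sketch of what a client might send to a proxy (the URL is just an example). The one visible difference from a direct request is the request line: the client sends the full URL rather than only the path portion.

```shell
# Hypothetical client-to-proxy request. A direct request to the origin
# server would start "GET /hypertext/WWW/TheProject.html HTTP/1.0";
# through a proxy, the full URL goes on the request line instead.
request="GET http://info.cern.ch/hypertext/WWW/TheProject.html HTTP/1.0
Accept: text/html"
printf '%s\r\n\r\n' "$request"
```

The proxy answers over the same connection with an ordinary HTTP response, whatever protocol it used to fetch the data.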
>The proxy environment variables cause the URLs to be sent with
>the protocol, the others without (I never checked what the real
>difference is in the use of WWW_xxx_gateway and the proxy one,
>and I never understood why the protocol info was omitted around
There are only four environment variables: http_proxy, gopher_proxy,
ftp_proxy, and wais_proxy. Each is expected to be set to a full URL, which
in Bourne shell you would export in the usual way.
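One possible Bourne shell setup looks like this (the proxy host and port are hypothetical; substitute your site's cern_httpd proxy). Note that every variable holds a full URL, and all four can point at the same proxy server:

```shell
# Hypothetical proxy host; a single cern_httpd proxy can serve all four.
http_proxy="http://proxy.example.com:8080/";   export http_proxy
ftp_proxy="http://proxy.example.com:8080/";    export ftp_proxy
gopher_proxy="http://proxy.example.com:8080/"; export gopher_proxy
wais_proxy="http://proxy.example.com:8080/";   export wais_proxy
```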
Actually, you can proxy news as well via the cern_httpd, but that's not
such a great idea. The environment variable names are different from the
old WWW_protocol_GATEWAY environment variables so that they don't get
confused with the old mechanism, and so that sites using the older
mechanism can migrate smoothly to the new standard method. I think people
mainly used the old method as a WAIS gateway, with its oddball double URLs.
The big difference between the old WWW_protocol_GATEWAY proxying and the
new standard is that in the new method the client always sends a full URL
(a real URL, like what the user would see in the client, not a double URL)
to the proxy, and the client and proxy talk only HTTP between themselves.
This was shown by the GET examples I posted. Since the proxy server gets a
full URL from the client, the same proxy server can proxy requests for all
destination protocols (http://, ftp://, gopher://, wais://). Also, the
proxy simply sends along all of the metainformation fields, Accepts, etc.
from the client when the URL is for an HTTP server (http://). This way, as
the HTTP protocol support expands in our clients and servers to include
more metainformation and so on, your site proxy server doesn't have to be
upgraded. The proxy server is just that, a proxy between clients and
servers on the Internet.
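A small sketch of why one proxy covers every protocol: since the client hands over a full URL on the request line, the proxy can tell http:// from ftp:// from gopher:// by itself. The destination URLs below are hypothetical.

```shell
# The client-side request lines are identical in form regardless of the
# destination protocol; only the URL's scheme differs.
requests=""
for url in http://info.cern.ch/ ftp://ftp.example.com/pub/README \
           gopher://gopher.example.com/1; do
  requests="$requests$(printf 'GET %s HTTP/1.0' "$url")
"
done
printf '%s' "$requests"
```

In every case the reply comes back to the client as an HTTP MIME message, so the client needs no FTP, Gopher, or WAIS code of its own.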
>And while we're on it I'd really like to have some mechanisms to only
>use a gateway at all, if the clients cannot connect directly. (IMHO
>it doesn't make too much sense to connect to servers on the same
>subnet/domain, that are e.g. on the same side of a firewall through
>a gateway server.)
Each client application will have to decide when to proxy and when not to.
A few messages went across this list about standardizing how clients should
make that decision, but we need more discussion. For now, all clients use
all-or-nothing proxying on a protocol-by-protocol basis.
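That per-protocol, all-or-nothing decision can be sketched roughly like this (the function and variable names are mine, not from any actual client source):

```shell
# Rough sketch of a client's choice: if the scheme's proxy variable is
# set, speak HTTP to the proxy and send the full URL; otherwise connect
# straight to the destination server.
choose_connection() {
  # $1 is the URL's scheme: http, ftp, gopher, or wais
  eval proxy=\$${1}_proxy
  if [ -n "$proxy" ]; then
    echo "proxy:$proxy"
  else
    echo "direct"
  fi
}
```

A client using this sketch would proxy every ftp:// request when ftp_proxy is set, and none when it is unset; there is no finer-grained (e.g. per-domain) control yet.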
>But before we have 20 environment variables to control the client
>and 2 different URLs that are sent out on gateway requests, could we
>please agree on some "standards" ?
I need to write up all the HTML to document this and get it posted on the
CERN server, etc. However, that doesn't prevent people from running clients
via the cern_httpd as a proxy, testing, and feeding results back to me and
this list. Proxy support is fairly easy to add to clients; I think the record
to add support right now is five minutes, but it didn't take much longer
for the NCSA folks to add proxy support given Lou's diffs. Server writers
that want to add proxy support should contact me if they don't understand
any of the workings described so far.