Re: Who can express URL syntax with BNF

Stan Letovsky (letovsky-stan@CS.YALE.EDU)
Tue, 26 Apr 94 09:47:21 -0400


Subject: Re: Who can express URL syntax with BNF
From: "Daniel W. Connolly" <connolly@hal.com>
Date: Mon, 25 Apr 94 18:57:43 +0100
To: Multiple recipients of list <www-talk@www0.cern.ch>
---------
>
>I made some attempts to write a yacc grammar for URL's, but it wasn't
>a very valuable excercise... regular expression matching works pretty well;
>e.g.:
>
>$Word = '[^/=;?#]*';
>
>$scheme = $1 if s*^([A-Za-z0-9\.-]+):**; # @# syntax of scheme?
>$hostport = &unescape($1) if s*^//($Word)**;
>$fragment = &unescape($1) if s*#($Word)$**;
>$search = &unescape($1) if s*\?($Word)$**;
>$path = &unescape($_);

Minor question:
This looks like perl, but I can't quite parse the regexps.
Is this some variant perl dialect or alternate regexp syntax?

Major question: This reminds me of an issue I stumbled across
recently, about the possible coexistence of #label and ?query-string
in the same URL. I did some experiments with Mosaic 2.4 that
suggested it does not recognize both in the same URL. (It ignored
the label, I think, but it was also ignoring labels in any
script results when relative URLs were used, so I am not
positive how it interprets this combination in all contexts.)
Your regexps do not suggest any exclusion between #label
and ?query; I can't tell whether they impose an order on them.
Does anyone know what the official position (if there is such
a thing) is on the legality and syntax of combining #label
and ?query in one URL?
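
For anyone who wants to poke at the ordering question, here is a
rough sketch of how I read the quoted substitutions, transcribed
into Python. It rests on assumptions: that s*...** is the ordinary
substitute operator with "*" as the delimiter and an empty
replacement, and that each match is removed from the string before
the next step runs. The name split_url and the sample URLs are my
own inventions, and the &unescape step is left out.

```python
import re

# Transcription of $Word from the quoted message: any run of
# characters other than / = ; ? #
WORD = r"[^/=;?#]*"

# The quoted substitutions, in the order they appear (assumed reading).
STEPS = [
    ("scheme",   r"^([A-Za-z0-9.-]+):"),
    ("hostport", r"^//(" + WORD + r")"),
    ("fragment", r"#(" + WORD + r")$"),
    ("search",   r"\?(" + WORD + r")$"),
]

def split_url(url):
    """Apply each pattern in turn, removing whatever it matched."""
    parts = {}
    rest = url
    for name, pattern in STEPS:
        m = re.search(pattern, rest)
        if m:
            parts[name] = m.group(1)
            # Mimic s*...** : delete the matched text from the string.
            rest = rest[:m.start()] + rest[m.end():]
    # Whatever survives all four steps is treated as the path.
    parts["path"] = rest
    return parts

if __name__ == "__main__":
    print(split_url("http://host:80/dir/doc?query#label"))
    print(split_url("http://host/doc#label?query"))
```

If that transcription is faithful, both parts are picked up only
when the URL ends in ?query#label. With #label?query the fragment
pattern fails (it is anchored at the end and $Word excludes "?"),
so the label is left in the path. Note also that $Word excludes
"=", so a query string containing "=" would not match at all under
this reading.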

-Stan