Re: Proposition on advanced URL features (Is # illegal)?

Roy T. Fielding (fielding@avron.ics.uci.edu)
Thu, 30 Nov 1995 21:12:33 -0800


>> > 1. The use of ## for special anchors seems reasonable.
>>
>> Use of more than one "#" character is illegal and not desirable
>> in the current URI syntax.
>
> It's an interesting point. Consider this quote from RFC 1808
> (by R. Fielding):
>
> |2.4.1. Parsing the Fragment Identifier
> |
> | If the parse string contains a crosshatch "#" character, then the
> | substring after the first (left-most) crosshatch "#" and up to the
> | end of the parse string is the <fragment> identifier. If the
> | crosshatch is the last character, or no crosshatch is present, then
> | the fragment identifier is empty. The matched substring, including
> | the crosshatch character, is removed from the parse string before
> | continuing.
> |
> | Note that the fragment identifier is not considered part of the URL.
> | However, since it is often attached to the URL, parsers must be able
> | to recognize and set aside fragment identifiers as part of the
> | process.
> |
>
> It clearly states that 'the first (left-most) crosshatch "#" and up to the
> end of the parse string is the <fragment> identifier'. This _does_ imply
> that more than one '#' character may appear ... Why say ``leftmost "#"
> character'' if only one is allowed? -- Mirsad

Because I believe in robust parsing. Look at the BNF (also in RFC 1808).
There is no conflict between the parsing rule and the BNF: the BNF does
not allow "#" anywhere but immediately preceding the fragment, and the
parsing rule only says how to split a string that contains more than one
anyway. Some would call this weasel wording, but I call it good design. ;-)
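
To make the distinction concrete, here is a minimal Python sketch. It is an
illustration only, not code from RFC 1808. The function names split_fragment
and has_legal_crosshatch_count are hypothetical, and the second function is a
deliberate simplification of the BNF that checks nothing except the number
of "#" characters.

```python
from typing import Tuple


def split_fragment(parse_string: str) -> Tuple[str, str]:
    """Robust split, following the section 2.4.1 wording quoted above.

    Everything after the first (left-most) "#" is the fragment, even if
    it contains further "#" characters. The "#" itself is dropped.
    Hypothetical helper name; not defined by the RFC.
    """
    remainder, _crosshatch, fragment = parse_string.partition("#")
    return remainder, fragment


def has_legal_crosshatch_count(parse_string: str) -> bool:
    """Simplified stand-in for the BNF: at most one "#" may appear.

    Assumption: this ignores every other syntax rule in the BNF.
    """
    return parse_string.count("#") <= 1


if __name__ == "__main__":
    for url in ("http://host/doc#sec", "http://host/doc##special"):
        print(url, split_fragment(url), has_legal_crosshatch_count(url))
```

A parser built this way accepts "http://host/doc##special" and sets aside
"#special" as the fragment, while the strict check still reports the string
as outside the grammar. That is the sense in which the two do not conflict.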

...Roy T. Fielding
Department of Information & Computer Science (fielding@ics.uci.edu)
University of California, Irvine, CA 92717-3425 fax:+1(714)824-4056
http://www.ics.uci.edu/~fielding/