Re: An Anchor attribute question:

Michael Mealling (ccoprmm@oit.gatech.edu)
Thu, 2 Jun 1994 12:39:27 -0400 (EDT)


Daniel W. Connolly said this:
> In message <199406021522.AA19445@oit.gatech.edu>, Michael Mealling writes:
> >Daniel W. Connolly said this:
> >> Actually, now that I think about it, If you're not going to include
> >> a redundant URL, why don't you just write:
> >>
> >> <A HREF="URN:IANA:IETF:rfc/822"> ...</a>
> >>
> >> ???
> >
> >This would work also. I would like to be able to make this distinction
> >in HTML though. Simply to keep in the spirit. There also seems to be
> >something in HTParse.c that is causing that example URN to be invalid
> >since HTParse.c: scan() function makes the assumption (which may be
> >a correct one according to the current URL spec) that no other colon
> >should exist beyond the first one. This is causing HTParse() to turn
> >the above into "URN:rfc/822" by basically looking at the first colon
> >READING BACKWARDS.
> >
> >Is this correct?
>
> Well... it depends on how you want to look at it. The URI working
> group's definition of URL is
> scheme:anything
>
> The WWW definition of URI (the contents of the HREF attribute) is:
> scheme://hostport/dir1/dir2;param=value?search#fragment
> where all the parts are optional, but only certain combinations
> make sense. (See
> http://info.cern.ch/hypertext/WWW/Addressing/URL/URI_Overview.html
> for details)
>
> So any WWW URI is an IETF URL, but the converse isn't true.
> HTParse.c assumes you're handing it a URI.
>
> Now if you define the syntax of URN to be:
> URN:anything
> then any URN is a URL, but it's not a URI.
>
> It would make more sense to me to define the syntax of URNs
> such that they are also URIs. So instead of:
>
> HREF="URN:IANA:IETF:rfc/822"
> you would write:
> HREF="URN://IETF.IANA/rfc/822"
>
>
> It's just an expedient measure to hasten deployment. The syntaxes
> have equivalent expressive power.

Ok, I've added a couple of lines to HTParse.c that fix this and a few
other things that the current URL spec breaks:

In scan(), I added these two lines just before the line
"after_access = name;":

    if (!strncmp(name, "URL:", 4))
        name = name + 4;

This handles the current URL spec's requirement of a "URL:" prefix in
front of a URL. Ordinary WWW URLs without the prefix still parse as before.
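
To make the effect of those two lines easier to check in isolation, here
is a minimal standalone sketch. The helper name skip_url_prefix() is
hypothetical and is not part of HTParse.c; it only wraps the same
strncmp() test. Note that the comparison is case-sensitive, so a
lower-case "url:" prefix is left alone.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of HTParse.c: returns a pointer just past
 * a leading "URL:" prefix, or the original pointer if the prefix is absent.
 * The test is case-sensitive, exactly like the two lines added to scan(). */
static const char *skip_url_prefix(const char *name)
{
    assert(name != NULL);
    if (!strncmp(name, "URL:", 4))
        name = name + 4;
    return name;
}
```

Calling skip_url_prefix("URL:http://info.cern.ch/") should hand back a
pointer to "http://info.cern.ch/", while a name without the prefix comes
back unchanged.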

Next, in the first for loop, the one that scans for the scheme, I added
a 'break;' as shown:

    for (p = name; *p; p++) {
        if (*p == ':') {
            *p = 0;
            parts->access = name;  /* Access name has been specified */
            after_access = p+1;
            break;                 /* <----- added line */
        }
        if (*p == '/') break;
        if (*p == '#') break;
    }

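This fixes the apparent small bug that causes URN:bla:bla: to get fouled up.
Without the break, each later colon before the first '/' or '#' is also
overwritten and after_access keeps moving forward.
Everything else seems to work normally.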
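
For anyone who wants to compare the two behaviours without building the
whole library, here is a rough standalone sketch of the loop. The
function split_access() and its stop_at_first_colon flag are my own
stand-ins, not anything in HTParse.c, and the real scan() does more
than this (it fills in a parts structure and goes on to pick out the
host, path and anchor).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the scheme-scanning loop in scan().
 * Overwrites a ':' in `name` with NUL (the buffer is modified in place),
 * stores the scheme in *access (NULL if no scheme was found), and returns
 * a pointer to the text after the scheme.
 * stop_at_first_colon != 0 : loop with the added break.
 * stop_at_first_colon == 0 : loop without it, so every colon seen before
 *                            the first '/' or '#' is overwritten. */
static char *split_access(char *name, char **access, int stop_at_first_colon)
{
    char *after_access = name;
    char *p;

    assert(name != NULL && access != NULL);
    *access = NULL;
    for (p = name; *p; p++) {
        if (*p == ':') {
            *p = 0;
            *access = name;        /* Access name has been specified */
            after_access = p + 1;
            if (stop_at_first_colon)
                break;
        }
        if (*p == '/') break;
        if (*p == '#') break;
    }
    return after_access;
}
```

With a writable copy of "URN:IANA:IETF:rfc/822", the version with the
break should leave "IANA:IETF:rfc/822" after the scheme, while the
version without it leaves only "rfc/822". A plain name such as
"http://host:80/" comes out the same either way, since the loop stops at
the first '/' before it ever reaches the port colon.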

Can anyone see anything wrong with these two changes?

-MM

-- 
------------------------------------------------------------------------------
<HR><A HREF="http://www.gatech.edu/michael.html">
<ADDRESS>Michael Mealling</ADDRESS>
<ADDRESS>michael.mealling@oit.gatech.edu</ADDRESS></A>