Byte offsets will not work, for two reasons:
1) Tags (does the byte offset include them or not?)
2) Whitespace (for which SGML has lovely parsing rules ;-))
The functionality is badly needed, though. Perhaps we could have
something like:
<MARK TYPE=HIGHLIGHT START=123 END=333>
where TYPE would indicate the role, and START and END would be word
offsets into the document. This might well be more complicated to
handle than paired tags, but it is safer in terms of the content model.
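
To make the word-offset idea concrete, here is a rough Python sketch of how
a reader of such a MARK element might resolve START and END. Everything in
it is an assumption made for illustration: a "word" is taken to be any run
of non-whitespace characters left after tags are removed, offsets are
1-based and inclusive, and the names words() and marked_words() are
hypothetical. The tag pattern is deliberately naive (no comments, no ">"
inside attribute values, no entities), and replacing a tag by a space means
a word with a tag inside it counts as two words.

```python
import re

# Naive tag matcher: anything between "<" and the next ">".
TAG = re.compile(r"<[^>]*>")

def words(sgml_text):
    """Drop tags, then split on runs of whitespace (assumed word rule)."""
    return TAG.sub(" ", sgml_text).split()

def marked_words(sgml_text, start, end):
    """Return words start..end, 1-based and inclusive (assumed convention)."""
    return words(sgml_text)[start - 1:end]

# The same text with and without tags and odd whitespace.
a = "<P>Byte offsets  will\nnot <EM>work</EM> here.</P>"
b = "Byte offsets will not work here."

# Both forms give the same word list, so the same offsets
# select the same words in either one.
print(words(a) == words(b))
print(marked_words(a, 2, 4))
```

Under these assumptions neither the tags nor the whitespace differences
shift the offsets, which is exactly where byte offsets fall down.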
Ideas? The above does not feel right to me, and neither do PIs or
paired tags.