Re: HTML parser in Yacc form???

Gavin Nicol (gtn@ebt.com)
Fri, 24 Mar 1995 12:30:27 +0500


>Parsing SGML with a top down recursive descent parser based on an FSR is
>by far the simplest approach to implement and also produces correct code.
>Why would anyone want to use an inappropriate tool which does the job less
>well and is more difficult to use?

True enough. I was just pointing out that it's not impossible, and in
particular, recommending the TEI subset.

>Yacc is OK if you actually have an LR(1) grammar. But it's best to
>steer well clear of it otherwise. In addition, error handling was
>never really thought out properly for yacc. I've never seen anyone
>successfully use the error productions without coming a cropper.

Quite! yacc error productions fall into the "black art" category at
the very least.

>I think the problem lies in comp sci classes being taught that bottom
>up parsing is `better' and the students not asking why. Goldfarb
>would not know an LR(1) grammar if one bit him on the nose. If he had,
>SGML might not fall into the "much wailing and gnashing of teeth"
>category which it does.

Well, the other thing is that many people perceive writing a recursive
descent parser to be harder than writing a YACC grammar
description. I'm not sure this is true. I'll be perfectly honest and
say that over the last 10 years, fewer than a third of the parsers I've
written used YACC (though flex is a godsend, at least for
prototyping). I saw one interesting parser concept which used "event
handlers" to create a very loosely coupled FSM. Quite interesting, and
very fast too.
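
To make the comparison concrete, here is a minimal sketch of a
hand-written recursive descent parser, with one method per grammar
rule. It assumes a toy tag language (matched open and close tags around
text), not real SGML: there are no attributes, entities, or omitted
tags. The names tokenize, Parser, and parse are hypothetical, chosen
only for illustration.

```python
import re

# Toy grammar (an assumption for illustration, not real SGML):
#   content := (element | text)*
#   element := '<' NAME '>' content '</' NAME '>'
TOKEN = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>|([^<]+)")


def tokenize(src):
    """Split the input into ('open' | 'close' | 'text', value) pairs."""
    tokens = []
    pos = 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        if m.group(3) is not None:
            tokens.append(("text", m.group(3)))
        elif m.group(1):
            tokens.append(("close", m.group(2)))
        else:
            tokens.append(("open", m.group(2)))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.i = 0

    def peek(self):
        """Return the current token, or an 'eof' marker at end of input."""
        if self.i < len(self.tokens):
            return self.tokens[self.i]
        return ("eof", "")

    def content(self):
        """content := (element | text)*"""
        nodes = []
        while self.peek()[0] in ("open", "text"):
            kind, value = self.peek()
            if kind == "text":
                self.i += 1
                nodes.append(value)
            else:
                nodes.append(self.element())
        return nodes

    def element(self):
        """element := '<' NAME '>' content '</' NAME '>'"""
        _, name = self.tokens[self.i]  # caller guarantees an 'open' token
        self.i += 1
        children = self.content()
        kind, value = self.peek()
        if kind != "close" or value != name:
            # The error is reported exactly where the grammar rule fails.
            raise SyntaxError(f"expected </{name}>, got {kind} {value!r}")
        self.i += 1
        return (name, children)


def parse(src):
    parser = Parser(tokenize(src))
    tree = parser.content()
    if parser.peek()[0] != "eof":
        kind, value = parser.peek()
        raise SyntaxError(f"unexpected {kind} {value!r}")
    return tree
```

Calling parse("<p>hi <b>there</b></p>") yields a nested list of
(name, children) tuples. Error reporting sits in the same function as
the rule that failed, which is a large part of why this style can be
easier to get right than yacc error productions.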

>PS: I have discovered that the correct pronunciation of "ASN.1" is
>"assassin 1".

Or perhaps asinine ;-)