Re: Filters for HTML?

Tim Berners-Lee (timbl)
Thu, 1 Apr 93 01:45:54 MET DST


Yes, we have very crude filters for converting clean SGML to TeX
-- just 'sed' files. They will take the output of the NextStep
WorldWideWeb.app because it puts line breaks in, and so sed
can handle it.

If you want to make a converter which parses the HTML properly,
you could take the line mode client version 2.0, and
in the library just hack the HTML regeneration module
HTMLGen.{c,h} to produce TeX instead of HTML. The module
is driven by a stream of text and element start/stop events,
identified by element number, so the output side is just a set
of tables of strings.
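
A minimal sketch of that table-driven idea, in C. The element
numbers, the function names (tex_start_element and so on), the
TexOut buffer and the TeX strings in the tables are all hypothetical,
chosen for illustration; this is not the actual HTMLGen.{c,h}
interface. Text escaping covers only # $ % & _ { } and leaves
backslash, ~ and ^ unhandled.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical element numbers; the real library's numbering may differ. */
enum { EL_TITLE, EL_H1, EL_P, EL_DL, EL_DT, EL_DD, EL_COUNT };

/* One entry per element: the string emitted when the element starts... */
static const char *const tex_start[EL_COUNT] = {
    "\\title{", "\\section{", "\n\n",
    "\\begin{description}\n", "\\item[", ""
};

/* ...and the string emitted when it ends. */
static const char *const tex_end[EL_COUNT] = {
    "}\n", "}\n", "",
    "\\end{description}\n", "] ", "\n"
};

/* Fixed-size output buffer (an assumption made to keep the sketch small). */
typedef struct {
    char buf[1024];
    size_t len;
} TexOut;

/* Append a string, truncating rather than overflowing the buffer. */
static void tex_put_string(TexOut *out, const char *s)
{
    size_t n = strlen(s);
    if (out->len + n >= sizeof out->buf)
        n = sizeof out->buf - 1 - out->len;
    memcpy(out->buf + out->len, s, n);
    out->len += n;
    out->buf[out->len] = '\0';
}

/* Append document text, backslash-escaping a few TeX special characters. */
static void tex_put_text(TexOut *out, const char *text)
{
    for (; *text; text++) {
        char esc[3] = { '\\', *text, '\0' };
        if (strchr("#$%&_{}", *text))
            tex_put_string(out, esc);      /* e.g. "&" becomes "\&" */
        else
            tex_put_string(out, esc + 1);  /* the character unchanged */
    }
}

/* Element start/stop events are plain table lookups by element number. */
static void tex_start_element(TexOut *out, int el)
{
    assert(el >= 0 && el < EL_COUNT);
    tex_put_string(out, tex_start[el]);
}

static void tex_end_element(TexOut *out, int el)
{
    assert(el >= 0 && el < EL_COUNT);
    tex_put_string(out, tex_end[el]);
}
```

With a zero-initialised TexOut, calling tex_start_element(&out, EL_H1),
then tex_put_text(&out, "Filters & tools"), then
tex_end_element(&out, EL_H1) leaves "\section{Filters \& tools}"
plus a newline in out.buf. Changing the output format means editing
the two tables, not the parser that drives them.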

If you are interested in our mapping, ask Arthur Secret
<secret@dxcern.cern.ch> to mail you our latest sed files.
We in fact made one new LaTeX macro for the paper docs
we push out, in order to do a better job of DL lists.

The basic sed files for making article-style LaTeX are
on the web ... look under "tools for information providers".

Tim
From janssen@parc.xerox.com Thu Apr 1 00:56:53 1993
Date: Wed, 31 Mar 1993 15:14:54 PST
Sender: Bill Janssen <janssen@parc.xerox.com>
From: Bill Janssen <janssen@parc.xerox.com>
To: www-talk@nxoc01.cern.ch
Subject: Filters for HTML?

Does anyone have filters that will convert HTML to TeX? Or TROFF? Or
PostScript? or anything...

Bill