Subject: RE: Converting to text
From: Jason Willebeek-LeMair <jlemair -at- cisco -dot- com>
To: Mark Baker <mbaker -at- omnimark -dot- com>, Jason Willebeek-LeMair <jlemair -at- cisco -dot- com>, TECHWR-L <techwr-l -at- lists -dot- raycomm -dot- com>
Date: Tue, 22 Feb 2000 11:46:13 -0600
<snippety/>
>You can't do RTF to XML (or anything else to XML) with XSLT.
Hmmm. Then the article I read about converting HTML to XML using XSLT
(on the Microsoft or xml.com site, I do not remember which) must be wrong...
For converting legacy documentation, OmniMark may be the ticket, or it
may not. For HTML, it is probably overkill (at least, according to
that article). For FrameMaker docs, you should look into Frame's XML
export function. For Word, it sounds like OmniMark fits the bill.
>XSLT has many weaknesses, and no advantages over Perl,
>Python, OmniMark, Java, or just about any other full
>programming language that either has a parser built in
>(as does OmniMark) or can interface to a parser.
Again, not exactly true.
One advantage is that you do not have to learn another &%^#ing
programming/scripting language.
As an XML newbie, I needed exactly one day to write several XSLT stylesheets that
converted the same source to RTF, HTML, and MML. Not bad, considering
that I also had to create the XML markup and figure out the target
markup languages. There are still a few bugs to work out of the RTF and
MML--mostly to do with tables (and with the fact that my knowledge of
those two formats comes from saving existing files to those formats and
viewing the output in Notepad)--but the HTML is fully formatted.
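For anyone who has not seen the single-source idea in practice, here is a
rough sketch of its shape. It is written in Python, not XSLT, and it is not
the markup or the stylesheets described above: the element names (doc, title,
para), the two rule tables, and the render function are all made up for
illustration. Each rule table stands in for one stylesheet, and the same XML
source is run through each of them. MML is left out, and no special
characters are escaped, which a real converter would have to handle.

```python
import xml.etree.ElementTree as ET

# Hypothetical source markup; the post does not show the real XML.
SOURCE = "<doc><title>Install</title><para>Run setup.</para></doc>"

# One rule table per target format: element name -> (prefix, suffix).
# These loosely play the role that templates play in an XSLT stylesheet.
HTML_RULES = {
    "doc": ("<html><body>", "</body></html>"),
    "title": ("<h1>", "</h1>"),
    "para": ("<p>", "</p>"),
}
RTF_RULES = {
    "doc": ("{\\rtf1\\ansi ", "}"),
    "title": ("{\\b ", "}\\par "),
    "para": ("", "\\par "),
}


def render(element, rules):
    """Recursively wrap an element's text and children using the rule table.

    Elements with no rule are passed through as bare text.
    Assumption: text contains no characters that need escaping.
    """
    prefix, suffix = rules.get(element.tag, ("", ""))
    parts = [prefix, element.text or ""]
    for child in element:
        parts.append(render(child, rules))
        parts.append(child.tail or "")
    parts.append(suffix)
    return "".join(parts)


# Same source, two outputs: only the rule table changes.
root = ET.fromstring(SOURCE)
html_output = render(root, HTML_RULES)
rtf_output = render(root, RTF_RULES)
```

Adding another target format here means adding another rule table, which is
roughly what adding another stylesheet amounts to in the XSLT case.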