TechWhirl (TECHWR-L) is a resource for technical writing and technical communications professionals of all experience levels and in all industries to share their experiences and acquire information.
For two decades, technical communicators have turned to TechWhirl to ask and answer questions about the always-changing world of technical communications, such as tools, skills, career paths, methodologies, and emerging industries. The TechWhirl Archives and magazine, created for, by and about technical writers, offer a wealth of knowledge to everyone with an interest in any aspect of technical communications.
Subject: Re: Best tool for migrating html to xml??
From: "Eric J. Ray" <ejray -at- raycomm -dot- com>
To: "TECHWR-L" <techwr-l -at- lists -dot- raycomm -dot- com>
Date: Mon, 15 Jul 2002 13:12:38 -0600 (MDT)

I'd suggest using Tidy (www.w3.org/People/Raggett/tidy/)
for cleanup of HTML and/or conversion to XML or XHTML.
Then, once you're in XHTML or XML, you can use XSLT to
upconvert to more logical and coherent element names
(if you're lucky enough to have enough regularity in
your files to be able to do so).
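
To make the second step concrete, here is a rough sketch of the
"upconvert" idea. It is written in Python with the standard
xml.etree.ElementTree module as a stand-in for an XSLT stylesheet, so
it shows the renaming logic and is not XSLT itself. The element
mapping, the function name upconvert, and the sample input are all
made up for illustration. The output is not guaranteed to be valid
DocBook. The sketch assumes the input is already well-formed XHTML,
for example after running something like
"tidy -asxhtml -numeric -o out.xhtml in.html".

```python
import xml.etree.ElementTree as ET

# Namespace that XHTML documents (including Tidy's XHTML output) declare.
XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical mapping from presentational XHTML names to more
# descriptive, DocBook-style names. Adjust it to your own content.
RENAME = {
    "body": "section",
    "h1": "title",
    "p": "para",
    "em": "emphasis",
    "code": "literal",
}


def upconvert(xhtml_text):
    """Rename elements inside <body> according to RENAME.

    Assumes well-formed XHTML with numeric entities only; named
    entities such as &nbsp; would make the parser raise an error.
    """
    root = ET.fromstring(xhtml_text)
    body = root.find(XHTML_NS + "body")
    if body is None:
        raise ValueError("no XHTML <body> element found")
    body.tail = None  # drop whitespace that followed </body>
    for elem in body.iter():  # iter() visits body itself, then descendants
        name = elem.tag.replace(XHTML_NS, "")
        elem.tag = RENAME.get(name, name)  # unmapped names pass through
        # Assumption: presentational attributes (style, class, align)
        # are not wanted in the target markup, so drop all attributes.
        elem.attrib.clear()
    return ET.tostring(body, encoding="unicode")


# Made-up sample input for illustration.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Install</title></head>
<body><h1>Install</h1><p>Run <code>setup</code> first.</p></body>
</html>"""

if __name__ == "__main__":
    print(upconvert(SAMPLE))
```

A one-to-one rename like this only works when the source files are
regular enough that, say, every h1 really is a section title. Where
they are not, the mapping needs per-file exceptions, and that is where
XSLT templates with match conditions tend to pay off.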
T >>We are switching from FrontPage to Docbook and XMetaL to create
T >>xml -> html content. What is the best way to migrate all the
T >>existing html content into xml? We will still save it all as html
T >>and compile it as html help in the build nightly, but need to have
T >>all the content in xml because part of the reference documentation
T >>content will be automatically generated from the build. Short of
T >>copying each file into XMetaL, what is the best/quickest way to get
T >>it all in?
T >>
G >
G >Just so you know, I'm converting from Word(purpose built template) to html
G >using HDK(purpose built template), and then using Dreamweaver(purpose built
G >template)/Perl to manipulate the HDK output. From there I'll be using Perl
G >to convert the html to xml. It may not be efficient, but it's what I know
G >how to use. What I have not yet covered is DTD/Schema generation.
G >
G >Geoff
G >
G >