TechWhirl (TECHWR-L) is a resource for technical writing and technical communications professionals of all experience levels and in all industries to share their experiences and acquire information.
Subject: Re: HTML to XML conversion
From: Sandy Harris <sandy -at- storm -dot- ca>
To: "TECHWR-L" <techwr-l -at- lists -dot- raycomm -dot- com>
Date: Tue, 15 Jan 2002 11:16:04 -0500
"Sandrine Touzé" wrote:
>
> I want to import our existing HTML documentation in order to generate online
> helps and printed manuals.
You can do that without XML. I do, for my docs at www.freeswan.org/doc.html.
The tool I use is htmldoc, free from www.easysw.com, though they do charge
for support. It takes a collection of HTML source files and produces:
  HTML with prev/contents/next links, plus a separate linked HTML TOC
  HTML as one big file with a TOC
  PDF
  PostScript
I use it on both Linux and Windows. As I recall, it runs in some other
environments too.
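As a rough illustration of the htmldoc step, here is a minimal Python sketch that only builds the command line. The function name and the file names are made up for the example, and the flags used (--book, -t, -f) are the ones I understand htmldoc to accept, so check them against the documentation for your version.

```python
import shlex


def htmldoc_command(sources, output, fmt="pdf"):
    """Build an htmldoc command line for a list of HTML source files.

    fmt is handed to htmldoc's -t option (for example "pdf", "ps" or "html").
    This helper is hypothetical; it only assembles arguments and runs nothing.
    """
    if not sources:
        raise ValueError("need at least one HTML source file")
    # --book treats the inputs as chapters of one document and adds a TOC.
    return ["htmldoc", "--book", "-t", fmt, "-f", output, *sources]


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the call.
    cmd = htmldoc_command(["intro.html", "install.html"], "manual.pdf")
    print(shlex.join(cmd))
```

To actually run the conversion, pass the returned list to subprocess.run() on a machine where htmldoc is installed.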
That said, using XML is probably a better way to do this. I often think
about switching myself, but so far haven't gotten around to it.
> I guess my question should actually rather be: what are the pre-requisites
> to switch from an HTML to an XML documentation? Shall I first create a DTD
I wouldn't suggest creating your own DTD unless you /both/ have enough XML
experience to be confident you'll do it well /and/ have some unusual
requirements that rule out using a standard DTD.
I'd look at the DocBook DTD first: www.docbook.org. The O'Reilly book on it
would likely be worth your while.
Look at www.linuxdoc.org for many examples. Most of the Linux Documentation
Project docs use DocBook. Their authors' guide may also be helpful.
> and an XSL stylesheet and then import the files? What is the procedure?
> Has anyone ever gone through these kind of steps?
As someone mentioned, the HTML Tidy program (download from w3c.org) will
do batch conversions. The same site has a free browser/editor called
Amaya that lets you do "save as XHTML". Of course, that only gets you to
XHTML, not to whatever other DTD you might want.
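A batch run of Tidy might be scripted along these lines. This is a sketch under assumptions: the helper name and paths are invented for the example, and it relies on Tidy's -asxhtml, -q and -o options as I understand them. It only prepares the commands, one per input file.

```python
from pathlib import Path


def tidy_commands(sources, out_dir):
    """Return one tidy command per HTML file, writing XHTML into out_dir.

    Hypothetical helper: it builds argument lists and does not run tidy.
    """
    cmds = []
    for src in sources:
        # Keep the base name, swap the extension: intro.html -> intro.xhtml
        dest = Path(out_dir) / Path(src).with_suffix(".xhtml").name
        # -asxhtml: emit XHTML; -q: suppress the summary; -o: output file
        cmds.append(["tidy", "-q", "-asxhtml", "-o", str(dest), str(src)])
    return cmds


if __name__ == "__main__":
    for cmd in tidy_commands(["docs/intro.html", "docs/install.html"], "xhtml"):
        print(" ".join(cmd))
```

If you feed these to subprocess.run(), note that Tidy exits with status 1 when it only issued warnings, so a non-zero status does not always mean the conversion failed.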
I have some sed scripts that do parts of an XHTML-to-DocBook conversion.
They are not finished. Mail me off list if you want them.
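To give a flavour of what such a conversion involves, here is a small Python sketch of sed-style tag renaming. It is not the scripts mentioned above. The tag mapping is a hypothetical, deliberately tiny subset, and it handles only simple one-to-one renames. Elements that need restructuring (list items, headings turning into nested sections, links) are left untouched.

```python
import re

# Hypothetical, incomplete mapping from XHTML element names to DocBook ones.
TAG_MAP = {
    "p": "para",
    "em": "emphasis",
    "code": "literal",
    "pre": "programlisting",
    "ul": "itemizedlist",
    "ol": "orderedlist",
}

# Matches an opening or closing tag whose name is in TAG_MAP.
# Any attributes on the tag are matched too, and dropped by the replacement.
TAG_RE = re.compile(r"<(/?)(%s)(?:\s[^>]*)?>" % "|".join(TAG_MAP))


def rename_tags(xhtml):
    """Rename mapped XHTML tags to DocBook tags, discarding their attributes."""
    return TAG_RE.sub(
        lambda m: "<%s%s>" % (m.group(1), TAG_MAP[m.group(2)]), xhtml
    )


if __name__ == "__main__":
    print(rename_tags('<p class="x">Use <code>tidy</code>.</p>'))
```

Regex substitution like this only works on tidy, well-formed input, which is one reason to run the files through Tidy first. For anything beyond simple renames, an XSL stylesheet is probably the more robust route.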