TechWhirl (TECHWR-L) is a resource for technical writing and technical communications professionals of all experience levels and in all industries to share their experiences and acquire information.
For two decades, technical communicators have turned to TechWhirl to ask and answer questions about the always-changing world of technical communications, such as tools, skills, career paths, methodologies, and emerging industries. The TechWhirl Archives and magazine, created for, by and about technical writers, offer a wealth of knowledge to everyone with an interest in any aspect of technical communications.
Subject: RE: Converting Word files into XML
From: "Ole Andersen" <ora -at- dita-exchange -dot- com>
To: "Kevin McGowan" <thatguy_80 -at- hotmail -dot- com>, <techwr-l -at- lists -dot- techwr-l -dot- com>
Date: Fri, 30 May 2008 06:26:33 +0200
Kevin,
I agree it would be nice to have a little "mapping tool" to smoothly migrate Word to DITA. But isn't converting the data (if I understand you correctly) just a tiny first step?
CHALLENGES?
What about the fact that you are converting monolithic documents to self-contained DITA topics (chunks)? In Word you will have all kinds of references ("see table above", "see section XY on page ZX", hyperlinks and bookmarks from one chapter to another, etc.), and you will have to deal with the fact that writing self-contained topics probably requires some training.
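To make the reference problem concrete, here is a minimal Python sketch that flags phrases which only make sense in a linear document, so that someone can rewrite them before the text becomes a topic. The patterns and the function name are assumptions for illustration, not part of any existing tool, and they will certainly miss cases.

```python
import re

# Hypothetical patterns for position-dependent phrases; extend them per document set.
XREF = re.compile(
    "|".join([
        r"\bsee\s+(?:the\s+)?(?:table|figure|section|chapter)\b[^.;]*",
        r"\bon\s+page\s+\w+",
        r"\b(?:above|below)\b",
    ]),
    re.IGNORECASE,
)

def find_position_references(paragraphs):
    """Return (paragraph_index, phrase) pairs for text that assumes the
    reader knows where they are in a linear document."""
    hits = []
    for index, text in enumerate(paragraphs):
        for match in XREF.finditer(text):
            hits.append((index, match.group(0).strip()))
    return hits

if __name__ == "__main__":
    sample = [
        "Install the driver first.",
        "For limits, see table above.",
        "Details are in section 4 on page 12.",
    ]
    for index, phrase in find_position_references(sample):
        print(index, phrase)
```

A report like this does not fix anything by itself; it only shows where a human has to decide whether the reference becomes a link, a repeated piece of content, or nothing at all.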
Apart from that, you will probably agree that no two random Word authors have used the same set of Word styles, which means that the "mapping tool" has to be very flexible.
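As a rough illustration of what "flexible" could mean, the mapping can live in a table that is swapped per author or per document set, with anything unmapped reported instead of silently guessed. This is only a sketch under assumptions: the style names, the `STYLE_MAP` table and `convert_paragraphs` are made up for the example, the input is assumed to be already extracted as (style, text) pairs for a single topic with its title first, and the output is a simplified DITA-like topic rather than validated DITA.

```python
import xml.etree.ElementTree as ET

# Hypothetical style map; each author's set of Word styles needs its own table.
STYLE_MAP = {
    "Heading 1": "title",
    "Body Text": "p",
    "Normal": "p",
    "Note": "note",
}

def convert_paragraphs(paragraphs, style_map, topic_id="topic_1"):
    """Build a simplified <topic> from (style, text) pairs.

    Returns the topic element and the list of styles that had no mapping,
    so a person can decide what to do with them."""
    topic = ET.Element("topic", id=topic_id)
    body = None
    unmapped = []
    for style, text in paragraphs:
        tag = style_map.get(style)
        if tag is None:
            unmapped.append(style)
            tag = "p"  # fall back to a plain paragraph for later cleanup
        if tag == "title":
            parent = topic
        else:
            if body is None:
                body = ET.SubElement(topic, "body")
            parent = body
        ET.SubElement(parent, tag).text = text
    return topic, unmapped

if __name__ == "__main__":
    topic, unmapped = convert_paragraphs(
        [("Heading 1", "Install"), ("Body Text", "Run setup."), ("MyCustom", "Odd")],
        STYLE_MAP,
    )
    print(ET.tostring(topic, encoding="unicode"))
    print("Unmapped styles:", unmapped)
```

The list of unmapped styles is the part that matters for cleanup: it tells you how much of the document the table did not cover.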
Last but not least: consistency. Companies often claim that they use Word styles consistently across their documents, but once you dig into the details you will probably find that this is not entirely true.
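One cheap way to test such a claim before any conversion is to count which styles actually occur. The sketch below assumes the paragraphs are already available as (style, text) pairs; `audit_styles` and the style names are hypothetical.

```python
from collections import Counter

def audit_styles(paragraphs, known_styles):
    """Count style usage in (style, text) pairs and return the full counts
    plus the subset of styles that are not in the agreed style set."""
    counts = Counter(style for style, _ in paragraphs)
    unknown = {s: n for s, n in counts.items() if s not in known_styles}
    return counts, unknown

if __name__ == "__main__":
    counts, unknown = audit_styles(
        [("Heading 1", "A"), ("Body Text", "b"), ("Body Text", "c"), ("body text 2", "d")],
        {"Heading 1", "Body Text"},
    )
    print(counts)
    print("Not in the agreed set:", unknown)
```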
BUT having said that, I still agree with you. It must be possible to create a tool like this to cover maybe 50% of the task and then "clean up" the remaining 50% once the conversion is completed.
THE WAY WE DO IT...
When we face a challenge like this, the main obstacle is converting monolithic documents into self-contained topics. This is in fact a manual (or semi-manual) task that requires some human intervention.
But we are not suggesting tagging line by line.
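For the "semi-manual" part, a first pass could propose topic boundaries at headings and leave the real decisions to a person. To be clear, this is not how the DITA Exchange editor works; it is a minimal sketch that assumes (style, text) pairs as input and uses a hypothetical `split_into_topics` function.

```python
def split_into_topics(paragraphs, split_style="Heading 1"):
    """Group (style, text) pairs into candidate topics, starting a new
    topic at each paragraph whose style equals split_style."""
    topics = []
    current = None
    for style, text in paragraphs:
        if style == split_style:
            current = {"title": text, "paragraphs": []}
            topics.append(current)
        elif current is None:
            # Content before the first heading: keep it rather than drop it.
            current = {"title": None, "paragraphs": [text]}
            topics.append(current)
        else:
            current["paragraphs"].append(text)
    return topics

if __name__ == "__main__":
    for topic in split_into_topics([
        ("Normal", "Preface"),
        ("Heading 1", "Install"),
        ("Normal", "Run setup."),
        ("Heading 1", "Configure"),
        ("Normal", "Edit the file."),
    ]):
        print(topic)
```

The candidates still need a human review, because a heading boundary says nothing about whether the chunk is actually self-contained.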
In our solution you will have the browser (the DITA Exchange topic editor) open in one window and the Word file open in a second window. Then you cut and paste from one window to the other. The tagging will BTW not be visible (it is done automatically), so the challenge is to cut from a Word document and paste into a forms-based browser interface.
This means BTW that the migration/conversion process is not limited to XML experts - subject matter experts can help in the process once they have had a few hours of guidance. So "grabbing" fairly large chunks of Word content and pasting them into the relevant field in the DITA editor is currently our "best practice".
After that we extract the TOC from the Word document and compose the DITA map - but that's probably the easy part.
I'm not sure whether you can use my input, but now I have passed on the way we currently do it :-)
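As a sketch of that last step: if the TOC is available as (level, title) pairs, a nested DITA map can be composed from it. The file-naming rule in `slug`, the `build_map` function and the sample headings are assumptions for illustration; heading levels are assumed to start at 1.

```python
import re
import xml.etree.ElementTree as ET

def slug(title):
    """Turn a heading into a file-name-safe string (hypothetical naming rule)."""
    return re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")

def build_map(headings, map_title):
    """Build a <map> from (level, title) pairs, nesting topicrefs by level.
    Levels are assumed to be 1 or greater."""
    root = ET.Element("map")
    ET.SubElement(root, "title").text = map_title
    stack = [(0, root)]  # (level, element) of the currently open ancestors
    for level, title in headings:
        # Close every open entry that is at the same level or deeper.
        while stack[-1][0] >= level:
            stack.pop()
        ref = ET.SubElement(stack[-1][1], "topicref",
                            href=slug(title) + ".dita", navtitle=title)
        stack.append((level, ref))
    return root

if __name__ == "__main__":
    toc = [(1, "Installation"), (2, "System Requirements"), (1, "Configuration")]
    print(ET.tostring(build_map(toc, "User Guide"), encoding="unicode"))
```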
Thanks,
Ole
Best Regards
>< Content Technologies ApS
Ole Rom Andersen
Director, Co-founder
Harevej 23
DK-8660 Skanderborg
Mobile: +45-4044-0553
Phone: +45-3696-0899
Skype: olerom
ora -at- dita-exchange -dot- com http://www.dita-exchange.com
-----Original Message-----
From: techwr-l-bounces+ora=dita-exchange -dot- com -at- lists -dot- techwr-l -dot- com [mailto:techwr-l-bounces+ora=dita-exchange -dot- com -at- lists -dot- techwr-l -dot- com] On Behalf Of Kevin McGowan
Sent: 29. maj 2008 17:39
To: techwr-l -at- lists -dot- techwr-l -dot- com
Subject: Converting Word files into XML
Hi all,
Recently started another new contract, and will most likely be moving some existing, thankfully small, documents from Word format into DITA XML via FrameMaker or XMetaL.
Thing is, I just chatted with a couple of guys here whose exclusive job it is to take GIANT Word files (they can range from 50 to 1,200 pages) and convert them into XML (not DITA, but some other DTD). I just got a tour of what they do, and they literally go through line by line in Dreamweaver, assigning tags as they go.
Has no one yet developed an amazing little tool that could map Word styles into XML tags to provide clean output? There's gotta be a faster way to do this, isn't there?
Cheers,
Kevin