TechWhirl (TECHWR-L) is a resource for technical writing and technical communications professionals of all experience levels and in all industries to share their experiences and acquire information.
For two decades, technical communicators have turned to TechWhirl to ask and answer questions about the always-changing world of technical communications, such as tools, skills, career paths, methodologies, and emerging industries. The TechWhirl Archives and magazine, created for, by and about technical writers, offer a wealth of knowledge to everyone with an interest in any aspect of technical communications.
That's a great thread from last year. Yves actually nailed it in his posts on using Frame and Mif2Go for the DITA conversion of unstructured Frame and Word docs.
Although, as you say, even if you can automate the conversion process in this manner, cleanup is still required, and that depends on how structured the content is that's being converted. (Yves suggested as much as well, with a great example.)
I'm certainly learning a lot more as I go along, and this really helps.
No matter how you slice it, there are two major tasks in this whole endeavor: converting "old school" content into contemporary, structured content (task/concept/reference topic or similar, with everything that entails), and then tagging a la DITA/XML.
For me it's not as much of a problem. I used DITA for a couple of years a few years ago, and have used Oxygen on and off over a number of years, for a combined total of maybe 2 years. I'm also using both now in personal work, just to stay current. (And I've used Arbortext, XMetaL, and some others.)
The consideration is really for a team of writers unfamiliar with these things: how much do you want to throw at them at once? These are two separate tasks, and both have to be taught. Writers have to learn how to structure content, and they have to learn how to tag content, or work with content that's already been tagged. (Developing and editing in an XML author/editor.)
My inquiry into structured writing and Information Mapping was an attempt to get a handle on the first task (structuring the content). My inquiry into Frame and Flare as potential intermediaries was a way to try to make the second task a little easier or more automated. So Yves suggests that this is indeed possible to a degree. (Thanks, Yves -- that post from last year helped big time.)
All right, so we're getting closer.
There's still a lot of work to be done, any way you look at it. The content auditing that Chris mentioned, in that thread from last year (and again in the recent thread), is pretty key too.
I can see analyzing our content and saying, "How much of this do we really need to provide to the user, and how much can we trim down?" Some of those decisions might be according to regulatory requirements. That will be discovered in the process.
It's complicated stuff. But I think if you look at before and after, where before is the way we've done it in the past (unstructured, "free-form" documents, as Chris calls them), and after is following the conversion to DITA or something similar, there does seem to be the potential for significant cost savings. A lot of companies have done this already and many more are following suit.
It's all very interesting. I'll update as things develop significantly.
Thanks again for the great info. It's much appreciated.
Steve
From: Stuart Burnfield [mailto:slb -at- westnet -dot- com -dot- au]
Sent: Friday, October 24, 2014 3:12 AM
To: Techwr-l; Janoff, Steven
Subject: Re: Resources for learning Structured Writing?
Hi Steve. I'm coming in late to answer your original question.
Mostly I'd just me-too Mark Giffen's first post. I found myself nodding at almost every paragraph.
I would want to know, have you used a CCMS with a built-in XML editor or a third-party XML editor such as oXygen, XMetaL, or the like?
Editor: We use Arbortext but are likely to move to Oxygen in the next year or two.
For some tasks where Arbortext is weak we also use the jEdit text editor with the XML plug-in.
CMS: We use the TortoiseSVN front-end to Subversion, mostly just to save snapshots and work-in-progress. Our coders might move to Git and if they do we will too.
So you might have legacy content in any of these formats: (text; Word; unstructured Frame; ID; Flare)
Therefore five different conversion paths.
Would it make sense to use either Flare or Structured FrameMaker as your intermediary to convert unstructured content to XML/DITA and then export to your CCMS with editor?
You could pilot test some conversions to see how much time it would take or save to go via Frame or Flare to get to DITA and Oxygen (or whatever). But if this is your roadmap:
- Steve and another writer do the bulk of the conversions and become the in-house experts on DITA and Oxygen.
- Initial conversion results in manuals that 'build clean' and produce roughly comparable output to the pre-conversion sources.
- Other writers have enough competence with Oxygen to tidy up converted topics and add new topics with support from the experts.
... detouring via Frame or Flare just delays your coming to grips with DITA and Oxygen.
Personally I would get comfortable with a good text editor with regex support (such as TextPad or jEdit). This will help you to become familiar with your source documents and also with the typical tag structures in your new DITA topics. If the source documents are very well-structured you will be able to do more in the way of automation; otherwise expect to do a lot of manual cleanup.
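As a rough illustration of what a regex can tell you about tag structure, here is a minimal Python sketch that counts how often each element name opens in a chunk of XML. The sample topic and the function name are made up for illustration, and the pattern is deliberately simple (it skips closing tags, comments, and declarations), so treat it as a way of getting a feel for the markup rather than as a parser.

```python
import re
from collections import Counter

# Matches the name of an opening or self-closing tag, e.g. <step> or <xref href="...">.
# Deliberately simple: closing tags, comments, and <?xml ...?> declarations do not match.
OPEN_TAG = re.compile(r"<([A-Za-z][\w.-]*)\b[^>]*>")

def tag_inventory(xml_text):
    """Count how often each element name opens in a chunk of XML text."""
    return Counter(OPEN_TAG.findall(xml_text))

# Hypothetical converted topic, used only to exercise the function.
sample = """<task id="t1"><title>Install</title>
<taskbody><steps>
<step><cmd>Run the installer.</cmd></step>
<step><cmd>Restart.</cmd></step>
</steps></taskbody></task>"""

for name, count in tag_inventory(sample).most_common():
    print(f"{name}: {count}")
```

The same pattern can be pasted into the search box of an editor such as jEdit to step through the opening tags one at a time.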
I know there are companies that specialize in data conversion, but if you go the in-house route, I'm wondering if anybody on the list has had experience with this kind of thing, and what you found.
If you go with a conversion vendor, make sure you agree with them what constitutes a converted topic. The minimum would be a topic that builds cleanly (no errors or warnings) and produces complete output. Yet this could still leave you with a significant amount of work to bring it to a level that meets your quality standards and yields the benefits that you hope to get from DITA. There might be some ideal combination of paying a vendor to do the brute-force work and finishing the rest off in-house.
To give a specific example: One of the publication sets I look after was converted from SGML to DITA by a partner company who had recent experience of converting similar material. They used an in-house script to produce Mixed/Combo (that is, un-typed) topics. This made the conversion much simpler as this topic type is very forgiving in terms of validation. If the conversion tool had been told to create a DITA task for each procedure, a lot of the content would have been 'quarantined' and marked as requiring manual cleanup because it didn't fit the fairly strict structure of the <task> topic type.
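To make the trade-off concrete, here is a hedged Python sketch of the kind of decision a converter faces. It is not the partner company's script, and the check is only a simplified stand-in for the real DITA task content model (which should be validated against the DITA DTD or schema). It assumes a procedure "fits" a task if it has a title and every step begins with a <cmd>; anything else falls back to a generic topic. The function names and sample fragments are hypothetical.

```python
import xml.etree.ElementTree as ET

def fits_simple_task(xml_text):
    """Return True if the fragment matches a simplified task shape:
    a <task> with a <title>, and a <taskbody> whose <steps> holds only
    <step> elements, each beginning with a <cmd>.
    Illustrative subset only, not the real DITA content model."""
    root = ET.fromstring(xml_text)
    if root.tag != "task" or root.find("title") is None:
        return False
    steps = root.find("taskbody/steps")
    if steps is None or len(steps) == 0:
        return False
    for child in steps:
        # Each child must be a <step> whose first child element is <cmd>.
        if child.tag != "step" or len(child) == 0 or child[0].tag != "cmd":
            return False
    return True

def choose_topic_type(xml_text):
    """Pick 'task' when the shape fits, otherwise fall back to generic 'topic'."""
    return "task" if fits_simple_task(xml_text) else "topic"

clean = ('<task id="a"><title>T</title><taskbody><steps>'
         '<step><cmd>Do it.</cmd></step></steps></taskbody></task>')
messy = ('<task id="b"><title>T</title><taskbody><steps>'
         '<step><p>Background first.</p><cmd>Do it.</cmd></step>'
         '</steps></taskbody></task>')

print(choose_topic_type(clean))
print(choose_topic_type(messy))
```

Legacy procedures often look like the second fragment, with explanatory text mixed into the steps, which is why a forgiving topic type makes the first pass so much easier.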
Other problems:
- The converter produced generic, meaningless file names and topic IDs. Names like cexu-overview-event-collection.dita are a lot more usable than cexug291.dita, cexug292.dita, cexug293.dita, but changing them after the fact is a lot of work.
- In-line links were not converted to DITA-style 'related topics' links. These had to be done case-by-case.
My point is that old content that's been minimally 'converted to DITA' can still require a lot more work.
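On the file-name problem, a small sketch of how descriptive names could be generated from topic titles at conversion time, rather than fixed afterwards. This is only an assumption about one workable approach: the prefix, the titles, and the function names are hypothetical, and a real rename would also have to update every map and cross-reference that points at the old names.

```python
import re

def slug_filename(prefix, title):
    """Turn a topic title into a lowercase, hyphenated .dita file name."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join([prefix] + words) + ".dita"

def rename_plan(prefix, titles_by_file):
    """Map each generic file name to a descriptive one, adding a numeric
    suffix when two titles would otherwise produce the same name."""
    plan, used = {}, set()
    for old_name, title in titles_by_file.items():
        new_name = slug_filename(prefix, title)
        n = 2
        while new_name in used:
            new_name = slug_filename(prefix, f"{title} {n}")
            n += 1
        used.add(new_name)
        plan[old_name] = new_name
    return plan

# Hypothetical titles for two of the generically named files.
titles = {
    "cexug291.dita": "Overview: Event Collection",
    "cexug292.dita": "Overview - Event Collection",
}
for old, new in rename_plan("cexu", titles).items():
    print(old, "->", new)
```

The plan is just a dictionary, so it can be reviewed by a writer before any file is actually renamed.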
Finally, some more background reading: This Techwr-l thread from July last year covers similar ground and there are some good tips from Yves and others:
Chicken and eggs scenario for structured writing http://www.techwr-l.com/archives/1307/techwhirl-1307-00142.html
You could contact the OP and see how his new DITA implementation is going.
Good luck! I find this sort of work really interesting. Let us know how you get on.
--- Stuart