Re: Can an XML authoring tool import MS Word Docs?
Subject: Re: Can an XML authoring tool import MS Word Docs?
From: "Mark Baker" <mbaker -at- omnimark -dot- com>
To: "Ron Rhodes" <RRhodes -at- fourthchannel -dot- com>, "TECHWR-L" <techwr-l -at- lists -dot- raycomm -dot- com>
Date: Mon, 8 Nov 1999 11:38:21 -0500
Ron Rhodes wrote:
>Does an XML authoring tool exist which enables a tech-writer to import Word
>documents? I downloaded and test-drove xMetal and it does not have an import
>feature.
Here we enter the dreaded area of syntax vs. semantics.
Every file format has a syntax and a semantics. RTF syntax, for example,
consists of groups delimited by curly braces, control words (a backslash
followed by a word and an optional numeric parameter), and text. RTF
semantics deals with what each of the control words means: bold, section
break, table, etc.
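
To make the split concrete, here is a minimal sketch in Python. It is not a real RTF reader: the regular expression, the tokenize function, and the MEANING table are all my own illustrative names, and the sketch ignores escapes, control symbols, and most of the format. The tokenizer knows only syntax; the lookup table is where the semantics lives.

```python
import re

# Syntax: three kinds of token (group braces, control words, plain text).
# Simplified on purpose: escapes such as \' and \\ are silently skipped.
TOKEN = re.compile(r"([{}])|\\([a-z]+)(-?\d+)? ?|([^\\{}]+)")

# Semantics: what a few control words mean (a tiny illustrative subset).
MEANING = {
    "b": "bold",
    "i": "italic",
    "par": "paragraph break",
    "sect": "section break",
}

def tokenize(rtf):
    """Split RTF text into (kind, ...) tuples without interpreting them."""
    tokens = []
    for brace, word, param, text in TOKEN.findall(rtf):
        if brace:
            tokens.append(("group", brace))
        elif word:
            tokens.append(("control", word, param or None))
        elif text:
            tokens.append(("text", text))
    return tokens

if __name__ == "__main__":
    for token in tokenize(r"{\b Hello} world\par"):
        if token[0] == "control":
            print(token, "means", MEANING.get(token[1], "unknown"))
        else:
            print(token)
```

Swapping the regular expression for a WordPerfect one would change the syntax half only; most of the MEANING table would carry over.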
File format conversion is easy between two file formats that have different
syntax but similar semantics. Since the Word and WordPerfect file formats
both describe printable documents, their semantics are very similar (barring
unique features) and it is fairly simple for a program to provide import
features in either direction. All it has to do is change the syntax.
But XML is a language for describing file formats. It does so by specifying
a single basic syntax that all XML languages must share, while leaving it
entirely up to the user to determine what each of the tags in the language
they create means. In other words, it is a way of giving you custom
semantics within a standardized syntax. The advantage is that you can use a
single parser (a syntax-recognizing engine) for a wide variety of different
semantics.
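
A short sketch of the single-parser point, using the XML parser in Python's standard library. The two tiny vocabularies (recipe and procedure) and the tag_names function are invented for illustration. The same parser reads both documents, yet it has no idea what an ingredient or a warning is.

```python
import xml.etree.ElementTree as ET

# Two made-up tagging languages that share nothing but XML syntax.
recipe = "<recipe><ingredient>flour</ingredient><step>Mix.</step></recipe>"
manual = (
    "<procedure><warning>Unplug first.</warning>"
    "<step>Open the case.</step></procedure>"
)

def tag_names(xml_text):
    """Parse any well-formed XML and list its element names in document order."""
    root = ET.fromstring(xml_text)
    return [element.tag for element in root.iter()]

if __name__ == "__main__":
    print(tag_names(recipe))
    print(tag_names(manual))
```

Both documents happen to contain a step element, but whether those two mean the same thing is a question the parser cannot answer. That is the semantics, and it lives in the heads of the people who designed each language.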
A program like Word is designed to let you manipulate the semantics of the
Word file format without having to worry about the syntax. An XML editor is
designed to let you manipulate XML syntax in a high-level way, but it has no
knowledge of your custom semantics.
All that a generic Word import feature could do in an XML editor is
translate from Word semantics in Word syntax to Word semantics in XML
syntax. This is probably not a very useful thing to do, because Word is a
better editor for Word semantics than any XML editor, ignorant of those
semantics, could ever be.
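
As a rough illustration of what such a generic import would produce, here is a sketch in Python. It assumes the Word content has already been pulled out as (text, formatting) pairs; the generic_import function and the document and run element names are hypothetical, not the output of any real tool.

```python
import xml.etree.ElementTree as ET

def generic_import(runs):
    """Wrap pre-extracted (text, formatting) pairs in XML, changing syntax only."""
    doc = ET.Element("document")
    for text, formatting in runs:
        # Formatting properties become attributes; nothing is reinterpreted.
        attributes = {name: str(value).lower() for name, value in formatting.items()}
        run = ET.SubElement(doc, "run", attributes)
        run.text = text
    return ET.tostring(doc, encoding="unicode")

if __name__ == "__main__":
    print(generic_import([("Warning", {"bold": True}), (" Unplug first.", {})]))
```

The result is well-formed XML, but all it says is that one run of text is bold. It still does not say that the text is a warning, which is the thing you presumably wanted XML for.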
If you want to translate existing Word documents into a custom XML-based
tagging language you have developed, you will need a custom translation
program, written in a language like OmniMark, as well as some human
intervention.
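
The shape of such a translation can be sketched in a few lines. This one is in Python rather than OmniMark, purely for illustration. It assumes the paragraphs have already been extracted from Word as (style name, text) pairs, and the STYLE_TO_TAG mapping, the topic root element, and the translate function are all hypothetical. Anything the mapping does not cover is set aside for a person to resolve.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from Word paragraph styles to tags in a custom language.
STYLE_TO_TAG = {
    "Heading 1": "title",
    "Body Text": "para",
    "Warning": "warning",
}

def translate(paragraphs):
    """Map styled paragraphs to custom tags; return the XML and the leftovers."""
    root = ET.Element("topic")
    unresolved = []
    for style, text in paragraphs:
        tag = STYLE_TO_TAG.get(style)
        if tag is None:
            # No rule for this style: this is where human intervention comes in.
            unresolved.append((style, text))
            continue
        ET.SubElement(root, tag).text = text
    return ET.tostring(root, encoding="unicode"), unresolved

if __name__ == "__main__":
    xml_out, todo = translate([
        ("Heading 1", "Installing"),
        ("Normal", "Be careful."),
        ("Body Text", "Insert the disk."),
    ])
    print(xml_out)
    print("Needs review:", todo)
```

How long the leftover list gets depends almost entirely on how consistently the original authors applied styles in Word.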
If you simply want an XML version of a Word document, you can use the free
RTF to XML converter found at: http://www.xmeta.com/omlette/ .
---
Mark Baker
Senior Technical Communicator
OmniMark Technologies Corporation
1400 Blair Place
Gloucester, Ontario
Canada, K1J 9B8
Phone: 613-745-4242
Fax: 613-745-5560
Email: mbaker -at- omnimark -dot- com
Web: http://www.omnimark.com