Subject: Re: The sky is starting to fall again
From: keith -at- soltys -dot- ca
To: "TECHWR-L" <techwr-l -at- lists -dot- techwr-l -dot- com>
Date: Wed, 27 Apr 2005 13:24:31 -0600

As far as I know, WordML is a functionally complete description of the
Word document format in XML. It's been stated on the word-pc mailing list
(by someone from Microsoft, if I remember correctly) that saving a Word
file as XML and then converting it back to .doc format is a good way of
removing corruption in a Word document (bad binary indexes and the like)
without losing anything in the file. I have done this with fairly
complex documents and have not found anything to be lost.
MS have published the spec for WordML on the web, and it's pretty well
documented in books like Office 2003 XML. I suppose you could write a DTD
or schema if you wanted to, and you could probably write documents in
native WordML in a text editor too, but it is a very verbose format, even
for XML.
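
To give a feel for that verbosity, here is a minimal sketch that builds a bare-bones WordML document from a list of plain-text paragraphs. The function name build_wordml is hypothetical, and the namespace URI and the mso-application processing instruction are written from memory of the Word 2003 XML format, so treat them as assumptions and check them against the published spec before relying on this.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for Word 2003 XML (WordML); verify against the spec.
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"


def build_wordml(paragraphs):
    """Return a minimal WordML document (as a string) with one
    paragraph element per entry in `paragraphs`.

    Hypothetical helper for illustration only: no styles, no
    document properties, just body > paragraph > run > text.
    """
    ET.register_namespace("w", W_NS)
    root = ET.Element(f"{{{W_NS}}}wordDocument")
    body = ET.SubElement(root, f"{{{W_NS}}}body")
    for text in paragraphs:
        # Even unformatted text needs three nested elements:
        # paragraph (w:p) > run (w:r) > text (w:t).
        p = ET.SubElement(body, f"{{{W_NS}}}p")
        r = ET.SubElement(p, f"{{{W_NS}}}r")
        t = ET.SubElement(r, f"{{{W_NS}}}t")
        t.text = text
    xml_body = ET.tostring(root, encoding="unicode")
    # The processing instruction is what (as I recall) tells Windows
    # to open the file in Word rather than in an XML viewer.
    return (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
        '<?mso-application progid="Word.Document"?>\n'
        + xml_body
    )


if __name__ == "__main__":
    print(build_wordml(["Hello from a text editor.", "Second paragraph."]))
```

Each line of plain text costs three levels of markup before any formatting is applied, which is why writing WordML by hand is possible but tedious.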
Unfortunately, WordML has limitations regarding nesting of elements and
the like which make it difficult, if not impossible, to convert a Word
document to another XML-based schema, such as DocBook. I had a post on my
weblog some time ago
(http://www.soltys.ca/coredump/2004/01/word-2003-and-docbook-probably-not.html)
that links to a Usenet article which goes into detail about this.
Essentially what it boils down to is that WordML was created to preserve
the format of the document, not the structure.
Best
Keith
> eric -dot- dunn -at- ca -dot- transport -dot- bombardier -dot- com wrote:
>
>
> The author never quite gets around to the thing everyone wants to
> know: does WordProcessingML constitute the opening of the MS Word
> file format? He *does* say that saving native MS Word to
> WordProcessingML creates crazy-quilt XML, but is hopeful this will
> improve. What's not said is whether users can -- perhaps with
> difficulty -- sort-out a de facto DTD for WordProcessingML and
> create fully-featured MS Word documents with a text editor or
> programmatically, as can be done with OpenOffice files. That would
> mean universal interchange is/will be possible, but it's not clear
> to me that that's even a Microsoft goal. I suspect it's not.
>
> The test is whether there'll be a public namespace and DTD(s).
>
>