I forward this for John for the reasons he explains...
>Mark,
>
>Could you please post this for me? I am a member, but the server has
>trouble with my mail.
>
>--John Mackin
>
>-------------------------------------------------------------------
>
>Hi there! I would like to add my $0.02 worth (I apologize to our
>international participants for the use of this local idiom and an even more
>idiomatic way of writing it) to the XML discussion.
>
>Mark Baker is making some very important points about understanding both
>the scope and "type" of languages we are discussing.
>
>1. Type of languages
>Starting from the ground up, there are basically two kinds of languages:
>natural and "artificial"; but within the latter comes the key type of
>"formal languages."
>
>Natural languages find their existence only in spoken or written
>"instances." Linguists then abstract from these instances to define an
>"instance-creating language" which we then study in school under the label
>of English, French, Japanese, or what have you. But the definitions are
>always instance-driven. Plato would say that they were one level "removed"
>from the instance. So here we have real instances and
>theoretical/abstracted definitions.
>
>Formal languages first find their existence in their formal definition;
>they *may* then find further existence in instances. But they gain value
>only by how much they are "instanced" (thus the low value ascribed to
>Esperanto).
>Here too we have definitions and instances. But there is a difference.
>Formal languages are not limited to languages that humans speak. And the
>definitions are not restricted to first-level definitions of instances. It
>is possible to conceive of definitions of definitions (second level
>removed) or even definitions of definitions of definitions (third level
>removed) and
>so on.
>
>Take COBOL, FORTRAN, and C++, for instance. These are first-level formal
>languages that are instanced in a "program." But other languages exist
>that are used to define the definitions in COBOL, FORTRAN, C++, and so on.
>These languages that define languages are second-level formal languages
>and are often called meta-languages to distinguish them from first-level
>languages. Examples are Backus-Naur Form (BNF) and the
>"train-track/trolley-car" format (railroad diagrams) often used in
>computer manuals.
>
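The first-level/second-level distinction can be sketched in Python. This is only an illustration under stated assumptions: the toy grammar, the token names (TITLE, ITEM), and the helper functions `match` and `is_instance` are all hypothetical and not from John's message. The dictionary plays the role of the definition; a token list either is or is not an instance of the language it defines.

```python
# A toy grammar written as data. In BNF notation it would read:
#   <minutes> ::= <title> <items>
#   <items>   ::= <item> <items> | <item>
#   <title>   ::= TITLE
#   <item>    ::= ITEM
# All names here are hypothetical, chosen only for illustration.
GRAMMAR = {
    "minutes": [["title", "items"]],
    "items": [["item", "items"], ["item"]],
    "title": [["TITLE"]],
    "item": [["ITEM"]],
}


def match(symbol, tokens, pos):
    """Return the set of positions where `symbol` can end when it starts at `pos`."""
    if symbol not in GRAMMAR:  # terminal: must equal the next token
        if pos < len(tokens) and tokens[pos] == symbol:
            return {pos + 1}
        return set()
    ends = set()
    for alternative in GRAMMAR[symbol]:
        positions = {pos}
        for part in alternative:
            # Advance every candidate position through the next part.
            positions = {e for p in positions for e in match(part, tokens, p)}
        ends |= positions
    return ends


def is_instance(tokens, start="minutes"):
    """True if the whole token list is an instance of the defined language."""
    return len(tokens) in match(start, tokens, 0)


print(is_instance(["TITLE", "ITEM", "ITEM"]))
print(is_instance(["ITEM"]))
```

Changing the dictionary changes the language being defined; the recognizer itself stays the same, which is roughly the relationship between a meta-language and the languages written in it.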
>In our case, SGML/XML are second-level meta-languages. They define another
>first-level "language" (often mistakenly called SGML/XML) that is used to
>create instances/documents. HTML is an example of such a first-level
>language. In our field of documentation we may find the term "tagging
>language" suitable to describe such a first-level language (as Mark
>suggests). SGML/XML are not tagging languages: they define those tagging
>languages (in a DTD). So if we write a DTD for meeting minutes, we cannot
>call the resulting language SGML/XML. We who created it must give it a
>name, perhaps "minutes tagging language" just as HTML could be called an
>"HTTP-convention based homepage tagging language."
>
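As a rough sketch of the meeting-minutes case, here is what such a DTD and one instance of it might look like, loaded with Python's standard library. The element names (minutes, title, item) and the owner attribute are invented for illustration. Note also that `xml.etree.ElementTree` only checks well-formedness; it reads past the DTD but does not validate the instance against it, so a validating parser would be needed for that step.

```python
import xml.etree.ElementTree as ET

# A hypothetical "minutes tagging language": the DOCTYPE block is the
# definition, and the <minutes> element below it is one instance.
MINUTES_DOC = """<?xml version="1.0"?>
<!DOCTYPE minutes [
  <!ELEMENT minutes (title, item+)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT item (#PCDATA)>
  <!ATTLIST item owner CDATA #IMPLIED>
]>
<minutes>
  <title>Weekly documentation meeting</title>
  <item owner="John">Review the DTD draft</item>
  <item>Agree on a name for the tagging language</item>
</minutes>
"""

# ElementTree parses the instance; the DTD declarations are not enforced here.
root = ET.fromstring(MINUTES_DOC)
tags = [child.tag for child in root]
print(root.tag, tags)
```

Nothing in the DTD says what a "title" or an "item" is for; it only says that a minutes element contains one title followed by one or more items.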
>So please, let's be clear whether we are describing a first-level or
>second-level language.
>
>2. Scope of language
>
>In documentation we want to control five aspects of our documents: content,
>structure, formatting, interrelationships, and (sometimes) version/level
>control.
>
>The scope of SGML/XML is restricted to defining the structure of documents
>(and assigning some arbitrary "attributes"). The semantics of the tags and
>the attributes are outside the scope of the SGML/XML languages. This
>restriction adds to the flexibility and power of the two languages, but
>also tends to confuse people because they naturally add meaning to
>character strings that match "words" in their language. BUT SUCH SEMANTICS
>ARE NOT PART OF SGML/XML.
>
>The designers of a DTD may arbitrarily assign "meaning" to the tags and
>attributes, but those meanings will not be understood by a program that
>checks whether a given instance complies with the DTD (such an error
>checker is called a "parser"). Those meanings can only be realized by
>"application programs" that come after the parse phase. And only those who
>write the DTD or want to use that DTD can write the application, because of
>the arbitrarily assigned "meanings." What are examples of such "meanings"?
>Formatting information such as placement, spacing, and size of font;
>content-type information such as "warning" or "error messages."
>
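The split between the parse phase and the application phase might be sketched like this in Python. The document, the tag names, and both "applications" (`render_plain` and `collect_warnings`) are hypothetical. The point is only that the parser hands back structure, and each application then attaches its own meaning to the same warning tag.

```python
import xml.etree.ElementTree as ET

DOC = (
    "<manual>"
    "<para>Close the valve.</para>"
    "<warning>Hot surface.</warning>"
    "</manual>"
)

# Parse phase: structure only. The parser knows nothing about what
# "warning" is supposed to mean.
root = ET.fromstring(DOC)


def render_plain(tree):
    """Hypothetical application 1: 'warning' means 'print it in capitals'."""
    lines = []
    for element in tree:
        text = element.text or ""
        if element.tag == "warning":
            lines.append("WARNING: " + text.upper())
        else:
            lines.append(text)
    return lines


def collect_warnings(tree):
    """Hypothetical application 2: 'warning' means 'entry for a safety index'."""
    return [element.text for element in tree.iter("warning")]


print(render_plain(root))
print(collect_warnings(root))
```

Both functions read the same parsed tree, yet neither interpretation is visible in the markup itself.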
>HTML is a first-level tagging language with arbitrarily assigned
>"meanings." All of the meanings are "format meanings," not "content
>meanings," except for the two sections HEAD and BODY. But the meanings are
>not what humans think they are. <P> does not mean "paragraph." It means
>"new line + skip one line + go to left margin" (for all the application
>programs/browsers that I have seen). <H3> does not mean "header, level-3."
>It means "change font to font specified for <H3>." Homepage designers know
>that the apparent meaning is not the real meaning and use this knowledge to
>create all those "cute" homepages that would make a TW person sick if he or
>she were to read the real coding.
>
>To summarize, HTML is a tagging language that defines text and graphics and
>interrelationships, with little structure, no content. It has formatting
>conventions implemented (sometimes differently) in a number of popular
>application programs (browsers). There are certain documents that beg for
>such functionality, while others would be severely constrained by it. In
>other words, for certain circumstances, HTML and a browser are all you need
>to provide users with documents.
>
>SGML needs other standard languages to create a totally "international
>standard" environment. (You don't have to buy this concept of international
>standards. You can lock yourself into a proprietary standard. But that is
>another topic.) SGML is grouped with DSSSL for formatting and with HyTime
>for flexible linking of all kinds. No international standards presently
>exist for content control and version/level control.
>
>When SGML was invented, the World Wide Web did not exist. Documents were not
>updated or viewed in realtime. So SGML was designed for a batch
>environment, not a realtime environment.
>
>XML is a variation of SGML (a few rules were eliminated and a few were
>changed) designed to work in a realtime environment. It is grouped with a
>different set of standards (W3C, not ISO). XLL is proposed for linking (it
>is a variation of HyTime) and XSL for formatting (and probably action
>scripting). No standards are planned for content control or version/level
>control.
>
>If you keep in mind the differences in type and scope of languages, the XML
>discussion will proceed faster and more accurately.
>
>--John Mackin
>
>At 11:32 98/04/15 -0400, you wrote:
>> Deborah Ray wrote:
>>
>> >If you're wondering about whether browsers will end up
>> >offering competing XML extensions (or some other makes-
>> >it-hard-to-develop-usable-pages garbage)...probably
>> >not. Because XML is completely customizable, you (not
>> >the browser companies) are in charge of what features
>> >are available.
>>
>> A word of clarification on extensibility. XML is not a tagging language;
>> it is a language for defining tagging languages. XML itself is not user
>> extensible. Only the XML working group can extend XML.
>>
>> Tagging languages written in XML are extensible. All tagging languages
>> are extensible. It is a myth that HTML is not extensible. The browser
>> wars were all about extensions to HTML. HTML is now at or approaching
>> version 4.0 -- impossible if it were not extensible.
>>
>> The point is, tagging languages are extensible by the people who create
>> them. More specifically, they are extensible by the people who write the
>> applications that process them (browsers, for example). (HTML 3.0 was
>> proclaimed by W3C, but it was a dud because the browsers did not support
>> it. HTML 3.2 was a reduced language that defined the tagging that the
>> browsers actually supported. The processing application is king.) So yes,
>> you can customize your XML, because you write your XML-based tagging
>> language and your own processing applications.
>> ---
>> Mark Baker
>> Manager, Corporate Communications
>> OmniMark Technologies Corporation
>> 1400 Blair Place
>> Gloucester, Ontario
>> Canada, K1J 9B8
>> Phone: 613-745-4242
>> Fax: 613-745-5560
>> Email mbaker -at- omnimark -dot- com
>> Web: http://www.omnimark.com
>>
>***************************************************
>* John Mackin, Fujitsu Learning Media, Limited *
>* <CALS, Technical Communication, Translation> *
>* jmackin -at- flm -dot- se -dot- fujitsu -dot- co -dot- jp *
>* TEL:+81-3-5762-8086 FAX:+81-3-5762-8074 *
>***************************************************
>
>