Subject: How to make a gigantic doc set easily searchable?
From: Geoff Hart <ghart -at- videotron -dot- ca>
To: TECHWR-L <techwr-l -at- lists -dot- techwr-l -dot- com>, Samantha Slate <samantha -dot- slate -at- gmail -dot- com>
Date: Sat, 23 Sep 2006 08:53:44 -0400
Samantha Slate wondered: <<Much of our documentation is intended to
help in-house consultants and tech support, as well as the occasional
tech-savvy customer, with the very complicated task of installing and
configuring our products. Currently, we publish all of our docs as
PDFs.>>
PDFs designed for onscreen use, or PDFs designed for printing that
your audience is forced to use onscreen? I say this not to be snide,
but rather to reinforce the difference. It's important and
underappreciated.
<<However, we've been getting complaints recently that the doc set is
not easy to navigate. We are looking at reorganizing the books in
ways that will seem more intuitive to users, but even so, we would
like to be able to offer some sort of advanced-search capabilities
that lets the user search more than one book at a time, using more
than one search term (unlike PDF, which only lets you search for one
text string at a time).>>
Make sure you understand exactly which tasks they find difficult.
I've seen many people rush off full of energy to solve
the wrong problem. As for the search engine, I'm not sure I can
help you (I'm not even remotely a PDF expert), but my take is
that if readers need to use the search function, then your table of
contents and index are failures.
Admittedly, that's an exaggeration, but I've made the statement
that extreme to make a point: search tools
must complement, not replace, effective navigation.
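For readers wondering what the requested search would involve, here is a minimal sketch of a multi-book, multi-term search. It is only an illustration under stated assumptions: the text of each book has already been extracted page by page (the extraction step is not shown), and the names `build_index` and `search` and the sample book titles are hypothetical, not taken from any PDF or search product.

```python
import re
from collections import defaultdict


def build_index(books):
    """Map each lowercase word to the set of (book, page) locations containing it.

    `books` is assumed to be a dict of book title -> list of page texts.
    """
    index = defaultdict(set)
    for book, pages in books.items():
        for page_no, text in enumerate(pages, start=1):
            for word in re.findall(r"[a-z0-9]+", text.lower()):
                index[word].add((book, page_no))
    return index


def search(index, query):
    """Return the sorted locations that contain every term in the query."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return []
    hits = set(index.get(terms[0], set()))
    for term in terms[1:]:
        # Keep only locations that also contain this term (AND search).
        hits &= index.get(term, set())
    return sorted(hits)


# Hypothetical two-book doc set used only to exercise the sketch.
books = {
    "Install Guide": ["Install the server first.", "Configure the firewall port."],
    "Admin Guide": ["Configure the server port and restart."],
}
index = build_index(books)
for book, page in search(index, "configure port"):
    print(f"{book}, page {page}")
```

The index maps words to locations across every book at once, so a query with several terms is just a set intersection. A real search engine would add stemming, phrase matching, and ranking, none of which is attempted here.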
<<The obvious thing (to me) that comes to mind is a password-
protected web site.>>
Given that your main audience is in-house staff, password protection
isn't at all useful. Even for your customers, why would you bother
protecting information that they already have freely available in PDF?
In the worst case, someone who wants to steal the information
will run it through OCR software--in fact, there's at least one
program designed specifically to extract text from PDFs into (say)
Word, and if a client has Adobe's Creative Suite, they can directly
import PDFs.
"Information wants to be free; writers want to be paid." (The first
half is usually credited to Stewart Brand.)
The important thing here is that your documentation is
useless to anyone who doesn't already own your software. So who cares
if someone steals it? Free up the information, and make them buy the
product. (Think of this as public-key cryptography for documentation:
you need both keys to do anything useful. <g>)
<<But is it possible to single-source our documentation so that it
can be used both as the content of such a Web site *and* in a PDF-
publishable format, since some customers do still want the PDFs?>>
It's easy enough if you plan to do so right from the start. This can
be as simple as adopting a continuous, one-column design. For
example, newspaper-style layouts in which stories break and
interweave across multiple pages can be very difficult to "unroll"
into a series of Web pages, whereas a linear text flow with each
level-1 heading denoting a new page (in print or on the screen) is
relatively easy to turn into HTML. The process can also be more
complicated, involving SGML/XML and single-sourcing, which lets you
produce complex print layouts that turn into simple and elegant Web
pages.
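The simple case described above, a linear text flow in which each level-1 heading starts a new page, can be sketched as follows. This is a rough illustration, not a production converter. It assumes the source is plain text in which a level-1 heading is a line beginning with "# " and paragraphs are separated by blank lines. That marker and the function names `split_into_pages` and `page_to_html` are assumptions made for the example.

```python
import html
import re


def split_into_pages(text):
    """Split a linear document into (title, body_lines) pages at level-1 headings."""
    pages = []
    for line in text.splitlines():
        if line.startswith("# "):  # assumed level-1 heading marker
            pages.append((line[2:].strip(), []))
        elif pages:  # text before the first heading is ignored
            pages[-1][1].append(line)
    return pages


def page_to_html(title, body_lines):
    """Render one page as a minimal HTML document, one <p> per paragraph."""
    raw = "\n".join(body_lines).strip()
    paragraphs = [p for p in re.split(r"\n\s*\n", raw) if p.strip()]
    body = "\n".join(
        f"<p>{html.escape(' '.join(p.split()))}</p>" for p in paragraphs
    )
    safe_title = html.escape(title)
    return (
        f"<html><head><title>{safe_title}</title></head>\n"
        f"<body>\n<h1>{safe_title}</h1>\n{body}\n</body></html>"
    )


# Hypothetical sample document used only to exercise the sketch.
doc = (
    "# Installing\n"
    "Run the installer.\n"
    "\n"
    "Then reboot.\n"
    "# Configuring\n"
    "Edit the <config> file.\n"
)
pages = split_into_pages(doc)
rendered = [page_to_html(title, body) for title, body in pages]
print(rendered[0])
```

Because the source is a single linear flow, the same text can feed a print layout unchanged, and the split into pages is purely mechanical. A newspaper-style layout offers no such single cut point, which is why it is harder to "unroll".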
The details of SGML/XML single-sourcing lie well outside my area of
expertise, but there are experts here you can ask. Write back to
techwr-l with specific details of your software and you'll get some
more specific advice.
<<Also, have any of you ever been asked to enhance search
capabilities for your documentation? If so, how did you respond to
the request?>>
Since I spent a significant amount of time creating kick-ass tables
of contents and indexes, the question never arose. <g> But we have
several PDF experts here who can help.