TechWhirl (TECHWR-L) is a resource for technical writing and technical communications professionals of all experience levels and in all industries to share their experiences and acquire information.
For two decades, technical communicators have turned to TechWhirl to ask and answer questions about the always-changing world of technical communications, such as tools, skills, career paths, methodologies, and emerging industries. The TechWhirl Archives and magazine, created for, by and about technical writers, offer a wealth of knowledge to everyone with an interest in any aspect of technical communications.
Subject: Editing tips from the National Security Agency
From: arroxaneullman -at- aol -dot- com
To: techwr-l -at- lists -dot- techwr-l -dot- com
Date: Wed, 25 Jan 2006 13:32:14 -0500
TechWhirlers,
The news article below seemed like it might be relevant to tech comm.
It's definitely a good warning for those concerned about sensitive
material.
Hiding confidential information with black marks works on printed
copy, but not with electronic documents, the National Security Agency
has warned government officials.
The agency makes the point in a guidance paper on editing documents
for release, published last month following several embarrassing
incidents in which sensitive data was unintentionally included in
computer documents and exposed. The 13-page paper
is called: "Redacting with confidence: How to safely publish sanitized
reports converted from Word to PDF."
Instead of covering up digital text with black boxes, it is better to
delete any information you don't want to share, the NSA suggested.
"The key concept for understanding the issues that lead
to...inadvertent exposure is that information hidden or covered in a
computer document can almost always be recovered," the NSA wrote in the
Information Assurance Division paper, dated Dec. 13 but only recently
posted to the Web. "The way to avoid exposure is to ensure that
sensitive information is not just visually hidden or made illegible,
but is actually removed."
BOX----------
Three common mistakes
There are a number of pitfalls for people trying to amend a sensitive
Word document for public release as a PDF. Here is the NSA's advice on
typical traps.
Redaction of text and diagrams
Covering text, charts, tables or diagrams with black rectangles, or
highlighting text in black...is not effective, in general, for computer
documents distributed across computer networks (i.e. in "softcopy"
format). The most common mistake is covering text with black.
Redaction of images
Covering up parts of an image with separate graphics such as black
rectangles, or making images "unreadable" by reducing their size, has
also been used for redaction of hardcopy printed materials. It is
generally not effective for computer documents distributed in softcopy
form.
Metadata and document properties
In addition to the visible content of a document, most office tools,
such as (Microsoft) Word, contain substantial hidden information about
the document. This information is often as sensitive as the original
document, and its presence in downgraded or sanitized documents has
historically led to compromise.
Source: NSA Information Assurance Division report
BOX----------
The unintended disclosure of metadata, resulting in high-profile leaks
of secrets, has led to red faces at businesses and government bodies in
the past. In March 2004, a gaffe by the SCO Group revealed which
companies it had considered targeting in its legal campaign against
Linux users.
More recently, pharmaceutical giant Merck was put in the hot seat
because of changes made to a document regarding the painkiller Vioxx.
There have also been document data leaks at the White House, the
Pentagon, the United Nations and others, according to compiled research
from Workshare, a maker of software that strips tell-tale hidden data
out of files.
There have been so many stumbles that the NSA document should be
welcome help, said Pete Lindstrom, an analyst with Spire Security in
Malvern, Pa.
"It ends up being a really big exercise in public humility because it
is an embarrassing issue," he said. "It affects governments more than
anyone else."
Cleaning up
Government analysts make three main missteps that will jeopardize
confidentiality when sanitizing documents, according to the NSA report.
"The most common mistake is covering text with black," the agency said.
While this works for printed material, "it is not effective, in
general, for computer documents."
The second top goof is similar: In this case, workers cover up
graphics and other images with new graphics, such as a black rectangle.
As with blacked-out text, a recipient of the document can often delete
the coverings and see the information that is intended to be hidden.
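
A toy model may make the difference concrete. The Python sketch below
is not how Word or PDF actually store content; the Page class and its
cover() and redact() methods are hypothetical names invented for
illustration. It assumes only what the NSA paper describes: a covering
shape sits on top of content that is still present in the file.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Toy page: text runs plus drawing objects stacked on top of them."""
    text_runs: list[str] = field(default_factory=list)
    overlays: list[int] = field(default_factory=list)  # indexes of covered runs

    def cover(self, index: int) -> None:
        """The mistake: draw a black shape over a run. The text stays."""
        self.overlays.append(index)

    def redact(self, index: int, placeholder: str = "[REDACTED]") -> None:
        """The fix: replace the run itself, so nothing is left to uncover."""
        self.text_runs[index] = placeholder

    def render(self) -> str:
        """What a viewer displays: covered runs show up as a black bar."""
        return " ".join(
            "#" * len(text) if i in self.overlays else text
            for i, text in enumerate(self.text_runs)
        )

    def extract_text(self) -> str:
        """What copy/paste or a text extractor sees: overlays are ignored."""
        return " ".join(self.text_runs)


page = Page(["Agent", "SMITH", "met", "the", "source"])
page.cover(1)
print(page.render())        # the name is blacked out on screen
print(page.extract_text())  # the name is still in the data
page.redact(1)
print(page.extract_text())  # the name is gone from the data
```

Replacing the text with a placeholder of fixed length, rather than
blanking it in place, also avoids hinting at how long the removed
passage was.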
The third gaffe is failure to remove information about the document,
such as change history, author name and creation dates, known as
metadata.
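
For a sense of how easy such metadata is to read, here is a minimal
Python sketch. It rests on an assumption that goes beyond the article:
it targets the newer .docx format (a zip archive that keeps author and
date fields in docProps/core.xml), not the binary .doc files of Word
2003 that the NSA paper addresses. The function names and the sample
author names are made up, and the sample file is built in memory so
the example is self-contained.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Made-up sample of the core-properties part of a .docx package.
SAMPLE_CORE_XML = (
    '<cp:coreProperties'
    ' xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"'
    ' xmlns:dc="http://purl.org/dc/elements/1.1/"'
    ' xmlns:dcterms="http://purl.org/dc/terms/">'
    '<dc:creator>J. Analyst</dc:creator>'
    '<cp:lastModifiedBy>R. Reviewer</cp:lastModifiedBy>'
    '<dcterms:created>2005-12-13T09:00:00Z</dcterms:created>'
    '</cp:coreProperties>'
)


def make_sample_docx() -> bytes:
    """Build a tiny zip that mimics only the metadata part of a .docx."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("docProps/core.xml", SAMPLE_CORE_XML)
    return buffer.getvalue()


def read_core_properties(docx_bytes: bytes) -> dict[str, str]:
    """Return tag -> text for every field in docProps/core.xml, if present."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        if "docProps/core.xml" not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read("docProps/core.xml"))
    # ElementTree prefixes each tag with "{namespace}"; keep the local name.
    return {child.tag.split("}")[-1]: (child.text or "") for child in root}


for name, value in read_core_properties(make_sample_docx()).items():
    print(f"{name}: {value}")
```

This reads only one metadata part. Tracked changes, comments and
embedded objects live elsewhere in the file, so an empty result here
would not mean a document is clean.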
To avoid such blunders, the NSA paper gives step-by-step instructions
on how to strip a Microsoft Word document of confidential information
and then convert it to an Adobe Systems PDF file. The advice deals with
text passages and images in the document, as well as with metadata.
Both the Word and Adobe PDF formats can contain many kinds of
information--such as text, graphics, tables, images and metadata--all
mixed together. "The complexity makes them potential vehicles for
exposing information unintentionally, especially when downgrading or
sanitizing classified materials," the NSA said.
Microsoft Word is used throughout the Department of Defense and the
intelligence community, while Adobe PDF is used "very extensively" by
all parts of the U.S. government and military services, the agency
said. It noted that government bodies often distribute cleaned-up
documents in PDF format, and cautioned: "As numerous people have
learned to their chagrin, merely converting an MS Word document to PDF
does not remove all metadata automatically."
Metadata methodology
Metadata could become an increasing problem in the future, Gartner
analysts warned recently. Vista, the next version of the Microsoft
Windows operating system, will let people tag files with metadata to
improve search capabilities, Microsoft has said. But those tags could
lead to unwanted disclosure of information, Gartner analysts said.
Microsoft provides some tools to remove metadata in its Office
applications and built into Word 2003 a feature to remove personal
information. However, these do not remove sensitive data from the main
document, nor do they remove all metadata of possible concern, the NSA
said.
Adobe supports the agency's guidance for proper editing techniques and
is developing additional documentation for other customers, John
Landwehr, director of security solutions and strategy for the San Jose,
Calif., technology company, said in a statement via e-mail.
"As the NSA points out, it's very important to actually remove the
redacted content from an electronic document--not just leave the data
in a document and attempt to graphically cover it," he said.
Following the guidelines will effectively clean a document, said Joe
Fantuzzi, chief executive of San Francisco-based Workshare, but could
be challenging for the less tech-savvy.
"They are way too complicated. It is going to take too long for people
to do the right thing, and people are going to continue to make
mistakes," he said.
Meanwhile, the NSA paper itself contains a bit of metadata. According
to its cover, the paper was created on Dec. 13, 2005. The properties of
the Adobe PDF file, however, state the document was created on Jan. 10,
2006.
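
A mismatch like that can be spotted with a few lines of code. The
Python sketch below is a rough illustration, not a PDF parser: it
assumes the file stores its /CreationDate entry as an uncompressed,
unencrypted literal string in the usual D:YYYYMMDD... form, which is
not true of every PDF. The function name and the sample bytes are
made up; only the two dates come from the article.

```python
import re
from datetime import date

# Matches entries such as: /CreationDate (D:20060110...)
CREATION_DATE = re.compile(rb"/CreationDate\s*\(D:(\d{4})(\d{2})(\d{2})")


def find_creation_dates(pdf_bytes: bytes) -> list[date]:
    """Return every creation date found in plain view in the raw bytes."""
    return [
        date(int(year), int(month), int(day))
        for year, month, day in CREATION_DATE.findall(pdf_bytes)
    ]


# Made-up fragment standing in for a PDF's document information entry.
sample = b"<< /Producer (example) /CreationDate (D:20060110) >>"
cover_date = date(2005, 12, 13)  # the date printed on the paper's cover

for found in find_creation_dates(sample):
    if found != cover_date:
        print(f"Cover says {cover_date}, file properties say {found}")
```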