Subject: PDF saved to gibberish
From: Nancy Allison <maker -at- verizon -dot- net>
To: TECHWR-L -at- lists -dot- techwr-l -dot- com
Date: Mon, 14 Jun 2010 09:45:38 -0500 (CDT)
Hi, all.
I have tried two ways to save the text of a PDF to .txt and both attempts produced a weird, symbol-font type gibberish.
This is what it looks like once it's pasted into Plain Text: . In the .txt file, it shows lots of male and female symbols, exclamation points, musical notes, and geometric figures.
I used the Acrobat Save as Text command, and also selected all the text and pasted it into a .txt file. Same result both times.
I selected the gibberish and assigned different fonts to it; the gibberish showed up in the selected fonts. It seems as if the text has been assigned to a different character set.
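One way to test the character-set hunch is sketched below. It assumes Python is available; the function name and the sample string are made up for illustration. The male/female symbols and musical notes are how the old DOS code page 437 draws control codes 0x0B through 0x0E, so a high share of such codes in the saved text would suggest that the PDF's fonts use a custom encoding rather than standard character codes.

```python
from collections import Counter

# Tab, line feed, and carriage return are normal in a text file, so skip them.
ORDINARY = {"\t", "\n", "\r"}


def control_char_report(text):
    """Count C0 control characters (U+0001 to U+001F) and their share of the text.

    Hypothetical helper for illustration; not part of Acrobat or any library.
    """
    counts = Counter(
        ch for ch in text if "\x01" <= ch <= "\x1f" and ch not in ORDINARY
    )
    share = sum(counts.values()) / len(text) if text else 0.0
    return counts, share


if __name__ == "__main__":
    # Made-up sample standing in for the saved .txt contents. For a real file,
    # something like open(path, encoding="latin-1").read() maps each byte to
    # one character, so nothing is lost or rejected while reading.
    sample = "\x0b\x0c!\x0e\x10\x11 ok\n"
    counts, share = control_char_report(sample)
    for ch, n in sorted(counts.items()):
        print(f"0x{ord(ch):02X}: {n}")
    print(f"control characters: {share:.0%} of the text")
```

If most of the file turns out to be control codes, copying and Save as Text are probably reporting the font's internal codes faithfully, and changing the font in the text editor would not be expected to help.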
The PDF document properties show a Security Method of "No Security," yet Document Assembly, Commenting, Signing, and Creation of Template Pages are listed as Not Allowed.
Everything else, including Content Copying, is allowed.
Any ideas as to what's going on, and how I can successfully extract the text?