TechWhirl (TECHWR-L) is a resource for technical writing and technical communications professionals of all experience levels and in all industries to share their experiences and acquire information.
Subject: Re: Convert PDF files to something WORD readable?
From: Laurence Burrows <burrows -at- IBM -dot- NET>
Date: Wed, 30 Dec 1998 16:16:46 +1100
Hi all:
Mark Forseth said:
----------snip
Has anyone tried Frame 5.5.6? Promo states:
"Place pages from PDF documents into FrameMaker documents."
----------snip
Answer -- one PDF page --> one [anchored] epsf image on a page -- great for
illustrations, a page from a brochure, Excel graphs (see my previous post),
etc. but not for transporting text.
However ----
I picked this up from a local magazine 'Desktop'. The Gemini product is an
Acrobat Exchange plug-in. Their web site is http://www.iceni.com
----------snip
Iceni has released a major upgrade to Gemini which greatly expands the
range of PDF conversion options available. This new release provides even
more advanced features which make this software the world's leading PDF
conversion plug-in. No other software in the world provides the following
features:
. Tables into HTML, SYLK, RTF or Tabbed Text
. Automatic detection of vector graphics with conversion to web-ready
JPEG or TIFF
. Images into JPEG, Progressive JPEG or TIFF
. Generation of HTML 3, HTML 4, RTF or ASCII
. Automatic image linking in HTML output
. Recognition and retention of typographic details for different fonts,
sizes, colours and styles
. Preservation of the correct reading order across multiple columns of text
. Intelligent association of images with their image captions
and...
The full list of output formats is:
Plain Text
HTML V3 - single column, with automatic image links
HTML V3 - tabular for retaining layout, with automatic image links
HTML V4 - single column, with automatic image links
HTML V4 - relative font sizes, with automatic image links
HTML V4 - tabular for retaining layout, with automatic image links
RTF
The full list of table output formats is:
Tabbed Text
HTML 3
HTML 4
HTML 4 Relative
SYLK
RTF
The full list of image and vector graphics output formats, from 72dpi to
800dpi, is:
JPEG
Progressive JPEG
TIFF
TIFF - RLE
The software recognises font sizes, colours and type styles to faithfully
reproduce the style of text in the output format. Sophisticated additional
recognition features help to ensure that the conversion process generates
clear and readable output - paragraphs, word hyphenation, rotated pages,
drop-caps, over-kerning and double spaced lines all receive special treatment.
The commercial version of the plug-in also includes a full English
dictionary with more than 24,000 words to further improve accuracy when
removing unwanted hyphenation from words broken across lines. Non-English
languages will also benefit from the improved de-hyphenation logic within
the plug-in.
The Plug-In supports PDFs written in most Western Languages.
--------snip
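
For anyone curious how the dictionary-based de-hyphenation mentioned in the release might work, here is a minimal Python sketch. It is not Iceni's code and I have no knowledge of Gemini's internals; the function name `dehyphenate` and the tiny `WORDS` set are hypothetical stand-ins for a real dictionary. The idea is simply: when a word is broken across a line with a hyphen, drop the hyphen only if the joined result is a known word, and otherwise assume the hyphen belongs there.

```python
import re

# Hypothetical stand-in for a real dictionary (the release mentions 24,000+ words).
WORDS = {"conversion", "software", "document"}

# A word fragment, a hyphen at end of line, then the fragment on the next line.
BROKEN_WORD = re.compile(r"(\w+)-\n(\w+)")


def dehyphenate(text, words=WORDS):
    """Rejoin words split across line breaks.

    If head + tail is in the dictionary, the hyphen is treated as a
    line-break artifact and removed. Otherwise the hyphen is kept, on the
    assumption that it is part of a genuine compound such as "web-ready".
    """
    def join(match):
        head, tail = match.group(1), match.group(2)
        joined = head + tail
        if joined.lower() in words:
            return joined            # "conver-\nsion" -> "conversion"
        return head + "-" + tail     # "web-\nready"  -> "web-ready"

    return BROKEN_WORD.sub(join, text)


if __name__ == "__main__":
    sample = "PDF conver-\nsion plug-in and web-\nready images"
    print(dehyphenate(sample))
```

Note that hyphens not followed by a line break (such as "plug-in") are left alone, and that the quality of the result depends entirely on how complete the word list is.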