Re: How does one find repeated sentences within a number of documents
Subject: Re: How does one find repeated sentences within a number of documents
From: Phil Snow Leopard <philstokes03 -at- googlemail -dot- com>
To: David Harrison <dharrison -at- moldmasters -dot- com>, "techwr-l -at- lists -dot- techwr-l -dot- com (techwr-l -at- lists -dot- techwr-l -dot- com)" <techwr-l -at- lists -dot- techwr-l -dot- com>
Date: Thu, 12 Jan 2012 23:35:27 +0700
In Mac OS X, we have TextWrangler.app and Grep searches to do that.
I'm not sure if there's a non-Mac equivalent, however.
On 12 Jan 2012, at 23:30, David Harrison wrote:
> We are looking at a large project where we need to identify whether translation memory (TM) or conversion to a write-once-use-many (w-o-u-m) application (such as Flare, Author-it, etc.) could pay dividends.
> My initial step is working out how to analyse a large number of varied docs to see if there is much text repetition. (I think the granularity should be at sentence level.) After all - if the vast majority of the documents contain mostly unique text, then TM or w-o-u-m would not really be worth considering.
> Does anyone know of any application or methodology that would enable us to do such?
> Right now we are at the very first steps and have little raw data, although we do know that the main tools for document preparation appear to be MS Word and Adobe InDesign.
> Eventually - if the vast majority of the documents contain mostly unique text, then TM or w-o-u-m would not really be worth considering. We need to find out which side of the fence we are on.
>
> Thanks all.
> David
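
For anyone without a grep-capable editor, the sentence-level check David describes could be roughed out with a short script. The following is only a minimal sketch, not a tested tool: it assumes the Word and InDesign files have already been exported to plain text, it uses a deliberately naive sentence splitter, and the names find_repeats, repetition_ratio and min_words are made up for illustration.

```python
import re
from collections import defaultdict


def split_sentences(text):
    # Naive splitter (an assumption): break after ., ! or ? followed by whitespace.
    # Abbreviations such as "e.g." will be split incorrectly.
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def normalize(sentence):
    # Lowercase and collapse whitespace so trivial differences do not hide a match.
    return re.sub(r"\s+", " ", sentence).strip().lower()


def index_sentences(docs, min_words=4):
    # docs: dict mapping a document name to its plain text.
    # Returns a dict mapping each normalized sentence to the list of
    # documents it occurs in (one entry per occurrence).
    seen = defaultdict(list)
    for name, text in docs.items():
        for sentence in split_sentences(text):
            key = normalize(sentence)
            if len(key.split()) >= min_words:  # skip very short fragments
                seen[key].append(name)
    return seen


def find_repeats(docs, min_words=4):
    # Keep only the sentences that occur more than once overall.
    seen = index_sentences(docs, min_words)
    return {s: names for s, names in seen.items() if len(names) > 1}


def repetition_ratio(docs, min_words=4):
    # Share of all counted sentences that have at least one duplicate.
    seen = index_sentences(docs, min_words)
    total = sum(len(names) for names in seen.values())
    repeated = sum(len(names) for names in seen.values() if len(names) > 1)
    return repeated / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical sample texts, standing in for exported documents.
    sample = {
        "a.txt": "Turn off the power before servicing. Remove the four cover screws.",
        "b.txt": "Turn off the power  before servicing. Install the new nozzle tip now.",
    }
    for sentence, names in find_repeats(sample).items():
        print(len(names), names, sentence)
    print("repetition ratio:", repetition_ratio(sample))
```

A ratio near zero would suggest the text is mostly unique; a high ratio would suggest TM or single-sourcing is worth a closer look. Exact matching after normalization misses near-duplicates (reworded sentences), so this gives a lower bound on reuse rather than a full picture.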