ScanSoft, Inc. recently announced the ScanSoft OmniPage Search Indexer for Google Desktop Search. The beta release of the plug-in, which is available free on the Google Web site, automatically creates text-index information from PDF files and faxes, as well as scanned books and documents — making them visible to Google Desktop Search. The plug-in indexes the image text found within PDF normal, PDF image, JPEG/JPG, TIFF/TIF (FAX), BMP and PaperPort MAX file formats, and supports personal computers running Microsoft Windows XP and Windows 2000.
The OmniPage Search Indexer uses ScanSoft’s highly accurate and fast optical character recognition (OCR) and PDF conversion technology to recognize the text within image-based content, creating the index information needed by the search application. ScanSoft’s OCR technology is behind the world’s largest book-scanning projects and has been selected by commercial vendors delivering imaging solutions, including AnyDocs, Autodesk, Avision, Brother, Canon, Captiva, Corex, Dell, FileNET, HP, Kofax, Konica, Kyocera, Lexmark, NSI, Omtool, Verity, Visioneer and Xerox.
‘Through the addition of OmniPage Search Indexer, Google has become the first of the major search competitors to seriously address PDF, fax, and scanned documents with desktop search,’ says Ralph Gammon, editor, Document Imaging Report. ‘This is important for users in traditionally paper-intensive professions such as law, insurance, and banking. It is also critical for the rest of us, who now receive faxes through email, as well as scanned documents sent as email attachments from multifunction devices and digital copiers.’
The OmniPage Search Indexer is based on technology found in ScanSoft OmniPage Pro Office 14, the company’s best-selling solution for turning paper and PDF files into documents you can edit and archive. OmniPage is also used to batch-convert various formats into searchable PDF archives for content management systems. Through the OmniPage Capture SDK, ScanSoft also provides this capability to developers who wish to add imaging and PDF capabilities to their applications.
‘Search is clearly expanding into areas where our imaging and speech recognition technologies can play an interesting role,’ says Robert Weideman, senior vice president of marketing and product strategy for ScanSoft’s Productivity Applications Division. ‘Whether it is for scanning and indexing books, or indexing the spoken words in audio and video content, our solutions deliver the accuracy and performance needed to turn hidden content into useful information.’
Versions for Dutch, French, German, Italian, Portuguese and Spanish will be made available within 30 days. Pricing for the final release of the plug-in has not been set.
Further details are available in the OmniPage Search Indexer FAQ document.