Making Good Use of Adobe Acrobat PDF Files

This topic gives detailed information about the PDF format supported by Readiris and ways in which you can make good use of the PDF files.

File formats

Note: compression is used for all elements. Black-and-white images are stored as Group 4 compressed TIFF files; greyscale and color images are stored as high-quality JPEG files (quality setting 0.8). The text is compressed using Gzip compression.
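To illustrate why compressing the text layer loses nothing, here is a minimal Python sketch. It uses Python's standard zlib module (deflate, the algorithm behind Gzip) as a stand-in; it is not Readiris code, and the function names and sample text are assumptions chosen for the example.

```python
import zlib


def compress_text(text: str) -> bytes:
    """Compress text losslessly with deflate (hypothetical helper)."""
    return zlib.compress(text.encode("utf-8"), 9)


def decompress_text(data: bytes) -> str:
    """Restore the original text from the compressed bytes."""
    return zlib.decompress(data).decode("utf-8")


# Illustrative sample only: repetitive text compresses well.
sample = "The recognized text can be edited and re-used. " * 50
packed = compress_text(sample)
restored = decompress_text(packed)

print(len(sample.encode("utf-8")), "bytes before compression")
print(len(packed), "bytes after compression")
print("Text unchanged:", restored == sample)
```

Unlike JPEG compression of images, this kind of compression is lossless: the decompressed text is identical to the original.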

Advantages

Editing the recognized text

The recognized text can be edited and re-used. (Bitmap images can be viewed but not edited.)

Use the TouchUp Text tool of the Acrobat software to correct small recognition errors in the PDF file.

Tip: you need the appropriate version of Acrobat (Reader) to display the resulting PDF files correctly! To view and print Central-European texts (such as Czech and Polish), Baltic texts, Turkish texts and Cyrillic ("Russian") texts in the PDF format, you must have the special "CE" (Central-European) version of Acrobat (Reader). (You can find this software on the Readiris CD-ROM.)

Exporting text to other applications

Intelligent searching

Use the Find command of your Acrobat (Reader) software for simple searches within a document. Use the Search command for advanced searching across several PDF documents.

Warning: not all versions of the Adobe Acrobat Reader software include the Search function!

Searching for words

The Find button of the Adobe Acrobat (Reader) software finds complete words or word parts in the current PDF document. Acrobat looks for the word by sequentially reading every word on every page in the file.

Searching on indexes

The Search button of the Adobe Acrobat (Reader) software allows you to perform fast, advanced searches on a collection of indexed PDF documents.

Index-based searching requires that a "full-text" index has been created for the collection of PDF files with the Catalog command. (A "full-text" index is an alphabetized list of every word used in a document or a series of documents. Index-based searching is much faster than the Find command: Acrobat goes right to the word in the list rather than progressively reading through the documents.)
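The difference between the two search methods can be sketched in a few lines of Python. This is only an illustration of the general technique, not the way Acrobat or the Catalog command is implemented: the file names, sample texts and function names are all assumptions, the documents are plain strings rather than PDF files, and only complete words are matched.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"\w+", text.lower())


def find_sequential(documents, word):
    """Find-style search: read every word of every document."""
    word = word.lower()
    hits = []
    for name, text in documents.items():
        if word in tokenize(text):
            hits.append(name)
    return sorted(hits)


def build_index(documents):
    """Catalog-style step: build the full-text index once, in advance."""
    index = defaultdict(set)
    for name, text in documents.items():
        for token in tokenize(text):
            index[token].add(name)
    return index


def search_index(index, word):
    """Search-style lookup: go straight to the word in the index."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical collection standing in for a set of PDF files.
documents = {
    "invoice.pdf": "Total amount due for scanning services",
    "manual.pdf": "Scanning a document and exporting the text",
    "letter.pdf": "Thank you for your letter",
}

index = build_index(documents)
print(find_sequential(documents, "scanning"))
print(search_index(index, "scanning"))
```

Both calls return the same documents. The sequential search rereads all the text for every query, whereas the index is built once and each later query is a single lookup, which is why index-based searching pays off on large collections.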