Monday, June 16, 2008

Correcting Paper Capture boo-boos


Although the OCR (Optical Character Recognition) software used by Paper Capture has become better and better over the years, it’s still far from perfect. After processing a scanned PDF document using the Formatted Text & Graphics output style, you need to check your processed document for words that Paper Capture didn’t recognize and therefore wasn’t able to convert from bitmapped graphics into text characters.
To make this check and correct these OCR errors, follow these steps:
  1. Choose Document➪Paper Capture➪Find First OCR Suspect. The program flags the first unrecognized word in the text by putting a gray rectangle around it and opens the Find Element dialog box. Acrobat shows a magnified view of the unrecognized word in the Find Element dialog box.
  2. Choose the TouchUp Text tool by clicking its button on the Advanced Editing toolbar.
  3. In the Find Element dialog box, choose one of the following options:
    • To accept the word displayed and convert it from a graphic into text and then continue to the next capture suspect, click the Accept and Find button.
    • To edit the suspect word directly in the Find Element dialog box, type over the incorrect characters and then click the Accept and Find button to go to the next suspect.
    • To ignore an unrecognized word and not convert it to text, just click the Find Next button to move right on to the next suspect.
  4. Repeat Step 3 until you’ve checked and corrected all the unrecognized words in the processed document. Note that if you choose Document➪Paper Capture➪Find All OCR Suspects, the program finds and highlights all suspect elements in the document without opening the Find Element dialog box. This allows you to individually choose which OCR suspect you’d like to edit.
  5. To edit one of the OCR suspects in a document after choosing the Find All OCR Suspects command, make sure the TouchUp Text tool is selected and double-click the desired element to open the Find Element dialog box. The selected OCR suspect appears in the Find Element dialog box. You can continue by repeating Step 3 or close the Find Element dialog box and repeat Step 5.
  6. Click the Close button in the lower-right corner of the Find Element dialog box to close it, and then choose File➪Save to save your corrections to the PDF document.
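
If it helps to see the three choices from Step 3 side by side, here is a small Python sketch that models the review loop. It is only an illustration: the Suspect class and the review and summarize functions are made-up names, not part of Acrobat or any Acrobat scripting interface, and the sample words are invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Suspect:
    """Hypothetical stand-in for one OCR suspect (not an Acrobat object)."""
    guess: str                  # the word as the OCR software read it
    text: Optional[str] = None  # None means the word is still a graphic


def review(suspects, decisions):
    """Apply one decision per suspect, in order.

    Each decision is "accept", "skip", or ("edit", corrected_word).
    """
    for suspect, decision in zip(suspects, decisions):
        if decision == "accept":
            # Like Accept and Find: keep the displayed word as text.
            suspect.text = suspect.guess
        elif decision == "skip":
            # Like Find Next: leave the word as a graphic.
            continue
        else:
            # Like typing over the word, then Accept and Find.
            _, corrected = decision
            suspect.text = corrected
    return suspects


def summarize(suspects):
    """Count how many suspects became text and how many stayed graphics."""
    converted = sum(1 for s in suspects if s.text is not None)
    return {"converted": converted, "left_as_graphic": len(suspects) - converted}
```

For example, accepting the first suspect, correcting the second, and skipping the third leaves two words converted to text and one still a graphic.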
