Converting PDF Files

If you are in the High School, go to the Copy Center and read the instructions on the PC. This is a really simple way of converting PDFs to text.

Images

If you have a PDF file (a poster, say) and you want to email it, you have two options:

1. Send it as an attachment, OR

2. Convert the PDF to a JPEG file and embed it in the body of the email. This is useful if your recipients are technophobes and have trouble opening attachments. In Adobe Acrobat, with the PDF open, choose File > Save As and save the file in JPEG format. Then drag the saved JPEG file into the body of your email.

Text

Converting scanned pages to searchable text

If you want to be able to search, correct, and copy the text in a scanned Adobe PDF file, you can "capture" the pages in one of three PDF output styles: Formatted Text and Graphics, Searchable Image (Exact), and Searchable Image (Compact). All three apply optical character recognition (OCR) and font and page recognition to the text images and convert them to normal text. The two Searchable Image styles keep a bitmap image of the pages in the foreground and place the captured text on an invisible layer beneath.

To convert scanned pages to searchable text:

Adobe Acrobat 8

1. Open the scanned PDF.
2. Choose Document > OCR Text Recognition > Recognize Text Using OCR.
3. In the Recognize Text dialog box, select an option under Pages.
4. (Optional) Click Edit to open the Recognize Text - Settings dialog box, and select the options you want to use.

Previous versions

1. Open the PDF file you want to capture, and choose Document > Paper Capture > Start Capture OR choose Document > OCR Text Recognition > Recognize Text Using OCR (which one depends on your version of Adobe Acrobat).

2. Specify the pages to be captured.

3. Under Settings, click the Edit button if you want to change the primary optical character recognition (OCR) language, the PDF output style, or the image downsampling.

· For PDF Output Style, choose Searchable Image (Exact) to keep the original image in the foreground and place searchable text behind the image.

· Choose Searchable Image (Compact) to apply compression to the foreground image. This reduces file size but also reduces image quality.

· Choose Formatted Text and Graphics to reconstruct the original page using recognized text, fonts, pictures, and other graphic elements.

You can use the Paper Capture command on pages that were scanned or imported with the following resolutions:

· Black-and-white images at 200 to 600 dpi (300 dpi is generally optimal).

· Grayscale or color images at 200 to 400 dpi.

Note: The primary OCR language menu is available only if you perform a custom installation and choose Roman Paper Capture.

4. In the Paper Capture dialog box, click OK to start the conversion.
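
As a rough illustration, the resolution ranges listed under step 3 can be written as a quick check to run before you scan. This Python sketch is not part of Adobe Acrobat; the names RECOMMENDED_DPI and resolution_ok are made up for this example, and the numbers are simply the ranges quoted above.

```python
# Recommended scan resolutions (in dpi) for OCR, copied from the ranges in step 3.
# The dictionary and function names are hypothetical, chosen for this example only.
RECOMMENDED_DPI = {
    "black-and-white": (200, 600),
    "grayscale": (200, 400),
    "color": (200, 400),
}


def resolution_ok(image_type, dpi):
    """Return True if dpi falls inside the recommended range for image_type."""
    low, high = RECOMMENDED_DPI[image_type]
    return low <= dpi <= high


print(resolution_ok("black-and-white", 300))  # inside the 200 to 600 range
print(resolution_ok("color", 600))            # above the 200 to 400 range
```

If the check fails, rescanning the page at a resolution inside the range is likely to give better OCR results than capturing the page as it is.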

Correcting words on captured pages

If you choose the PDF Formatted Text and Graphics format as the PDF Output Style, Acrobat "reads" bitmaps of text and tries to substitute words and characters for the bitmaps. When it isn't certain of one of its substitutions, it marks the word as a Capture Suspect. Suspects are shown in the PDF as the original bitmap of the word, but the text is included on an invisible layer behind the bitmap of the word. This makes the word searchable even though it is displayed as a bitmap. You can accept these suspects as they are, or you can use the TouchUp Text tool to correct them.

Note: You must have converted your scanned page to formatted text and graphics before you can correct suspect words.

To review and correct suspect words on captured pages:

1. Do one of the following:

· Choose Document > Paper Capture > Find All OCR Suspects (or Document > OCR Text Recognition > Find All OCR Suspects, depending on your version of Adobe Acrobat). All suspect words on the page are enclosed in boxes. Click any suspect word to show the suspect text and its original bitmap image in the Find Element window.

· Choose Document > Paper Capture (or OCR Text Recognition) > Find First OCR Suspect. The suspect text and the original bitmap image are shown in the Find Element window.

Note: If you close the Find Element window before correcting all suspect words, you can return to the process by choosing Document > Paper Capture (or OCR Text Recognition) > Find First OCR Suspect, or by clicking any suspect word with the TouchUp Text tool.

Suspect word as it appears highlighted on the page (left) and the suspect word as it appears in the Find Element window (right)

2. Compare the suspect word with the image of the word in the Find Element window.

3. Do one of the following:

· To accept the word as correct, click Accept and Find. This moves you to the next suspect word.

· To correct the word, edit it and then click Accept and Find to move to the next suspect word.

· To keep the suspect image in place and move to the next suspect word, click Find Next.

4. Review and correct the remaining suspect words, and then close the Find Element window.

Errors

If you get the message "This page contains renderable text", it means there is already readable text (even one character!) on the page.

Try to copy the text and paste it into a new Apple Pages or MS Word document.

(Control A or Command A to Select All, Control C or Command C to Copy, then open a new document and press Control V or Command V to Paste.)

(The Control key is for PCs; the Command key is for Macs and is the one with the apple symbol on it. The key labeled Control on the Mac keyboard will not work.)

If that doesn't work, find the readable text and delete it, then convert as above.

The above is from the Adobe Acrobat Help pages.

MVRHS 6/29/2010