“Discard invisible text” option in PDF Studio

Q: I want to perform a fresh OCR of all pages. Some pages already have invisible text. How can I remove this text and run OCR again? A: This option is available in PDF Studio 12 and above. It removes any previous OCR text that has been added to the page. To use this option, follow the steps below: […]

OCR two different languages at once

Q: Can I OCR two different languages at once in PDF Studio? A: Starting in PDF Studio 11, you can OCR two different languages at once by following the instructions below: 1. Download the languages that you need: go to Edit -> Preferences -> OCR, select Download OCR Languages, and check the languages that […]

Deskew (straighten) pages when scanning to PDF

Q: How can I deskew (straighten) a document when scanning so that I get a better OCR result? A: Starting in PDF Studio 12 (Pro), you can deskew (straighten) a document from the scan dialog. In the scan dialog, make sure to check the “Auto deskew images” checkbox. See below for the scan dialog when […]

Could not initialize tesseract / OCRBridge when OCRing a document

Q: When I OCR a document, I see errors such as “Could not initialize tesseract”, “OCR library is not loaded: null”, or “unable to initiate OCRBridge”. What are these errors and how can I fix them? A: You are seeing these errors because you have renamed, moved, or deleted the “tess” folder under your user […]

How to proofread and correct OCRed text in a PDF

Q: After running a PDF through OCR, I need to be able to inspect the result and, if necessary, correct it. Is it possible to show the text added by the OCR in PDF Studio? A: We don’t yet have a specific tool or view that lets users inspect the OCR text, but we […]

OCRing images on a PDF page that already contains text

Q: If I have a mixed-content document, containing some text and some images, can PDF Studio OCR the images only? A: PDF Studio can handle mixed-content pages, i.e., pages that contain both images and text content. PDF Studio will simply ignore any existing text content and perform OCR on the rest of the page, so […]

OCRing PDF Documents: Page contains invisible text

Q: When I OCR a PDF document, I see a message in the dialog reading “Page [x] contains invisible text… not processed”. What is it and how can I solve it? A: PDF Studio performs OCR page by page, and the message you are seeing means that the corresponding page has already been OCRed (using PDF […]

Permission error downloading OCR languages

In PDF Studio 8, when attempting to download OCR language files, you may receive an error like: Error: user does not have permission to write to: opt/pdfstudio8/tess/tessdata. OCR language files are downloaded into the PDF Studio installation folder, and sometimes a user does not have permission to write to this folder. If you see […]

PDF OCR for Mac, Windows, and Linux

Q: Does PDF Studio, Qoppa’s PDF editor for Mac, Windows and Linux, have an OCR (Optical Character Recognition) function to recognize and add text to PDF documents? A: Yes! OCR was added in version 8 of PDF Studio (Pro edition). PDF Studio Pro can apply OCR to existing PDF documents (turning them into searchable PDFs) […]

OCR Languages

Integrated OCR Languages: Within PDF Studio, it is possible to download and install the following languages for OCR. 5 most common languages: English, French, German, Italian, Spanish. Other languages available: Danish, Dutch, Finnish, Norwegian, Polish, Portuguese, Swedish. Non-Latin Languages (including CJK): For PDF Studio 10 & below: For advanced users only, in addition to the […]
