Q: How can I OCR PDF documents? A: PDF Studio can OCR both existing documents and newly scanned documents. See the steps below. OCR an Existing Document: In PDF Studio, open the existing document you want to OCR. Navigate to the Document Tab > […]
Articles Tagged: OCR
“Discard invisible text” option in PDF Studio
Q: I want to perform a fresh OCR of all pages. Some pages already have invisible text. How can I remove this text and OCR again? A: This option is available in PDF Studio 12 and above. It removes any previous OCR text that has been added to the page. To use this option, follow the steps below: […]
OCR two different languages at once
Q: Can I OCR two different languages at once in PDF Studio? A: Starting in PDF Studio 11, you can OCR two different languages at once by following the instructions below: 1. Download the languages that you need. Go to Edit -> Preferences -> OCR. Select Download OCR Languages. Check the languages that […]
Deskew (straighten) pages when scanning to PDF
Q: How can I deskew (straighten) a document when scanning so that I can get a better OCR result? A: In PDF Studio 12 (Pro) and above, you can deskew (straighten) a document from the scan dialog. In the scan dialog, make sure to check the “Auto deskew images” checkbox. See below for the scan dialog when […]
Could not initialize tesseract / OCRBridge when OCRing a document
Q: When I OCR a document, I get errors such as “Could not initialize tesseract”, “OCR library is not loaded: null”, or “unable to initiate OCRBridge”. What are these errors and how can I fix them? A: There are four possible reasons why you’re seeing one of the errors above: If you […]
How to proofread and correct OCRed text in a PDF
Q: After running a PDF through OCR, I need to inspect the result and, if necessary, correct it. Is it possible to show the text added by the OCR in PDF Studio? A: We don’t have a specific tool or view that allows users to inspect the OCR text yet, but we […]
OCRing images on a PDF page that already contains text
Q: If I have a mixed-content document containing some text and some images, can PDF Studio OCR the images only? A: PDF Studio can handle mixed-content pages, i.e., pages that contain both images and text content. PDF Studio will simply ignore any existing text content and perform OCR on the rest of the page, so […]
OCRing PDF Documents: Page contains invisible text
Q: When I OCR a PDF document, I see a message in the dialog reading “Page [x] contains invisible text… not processed”. What is it and how can I solve it? A: PDF Studio performs OCR page by page, and the message you are seeing means that the corresponding page has already been OCRed (using PDF […]
Permission error downloading OCR languages
In PDF Studio 8, when attempting to download OCR language files, you may receive an error like: Error: user does not have permission to write to: opt/pdfstudio8/tess/tessdata. OCR language files are downloaded into the PDF Studio installation folder, and sometimes a user will not have permission to write to this folder. If you see […]
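To confirm whether the current user can write to the language folder before retrying the download, a quick check along these lines may help. This is a minimal sketch and not part of PDF Studio: the function name `can_write_to` is hypothetical, and the folder you pass in depends on where PDF Studio is installed on your machine.

```python
import os


def can_write_to(directory: str) -> bool:
    """Return True if `directory` exists and the current user may create files in it."""
    # W_OK: permission to write; X_OK: permission to enter the directory.
    return os.path.isdir(directory) and os.access(directory, os.W_OK | os.X_OK)


if __name__ == "__main__":
    import sys

    # Pass the tessdata folder of your own installation as the first argument.
    folder = sys.argv[1] if len(sys.argv) > 1 else "."
    status = "writable" if can_write_to(folder) else "NOT writable"
    print(f"{folder}: {status}")
```

If the folder is reported as not writable, the download needs to be run by a user who has write access to the installation folder.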
PDF OCR for Mac, Windows, and Linux
Q: Does PDF Studio, Qoppa’s PDF editor for Mac, Windows and Linux, have an OCR (Optical Character Recognition) function to recognize and add text to PDF documents? A: Yes! OCR was added in version 8 of PDF Studio (Pro edition). PDF Studio Pro can apply OCR to existing PDF documents (turning them into searchable PDFs) […]