“Discard invisible text” option in PDF Studio

Q: I want to perform a fresh OCR of all pages, but some pages already have invisible text. How can I remove this text and OCR again? A: This option is available in PDF Studio 12 and above. It removes any previous OCR text that has been added to the page. To use this option, follow the steps below: […]

OCR two different languages at once

Q: Can I OCR two different languages at once in PDF Studio? A: Starting in PDF Studio 11, you can OCR two different languages at once by following the instructions below: 1. Download the languages that you need. Go to Edit -> Preferences -> OCR, select Download OCR Languages, and check the languages that […]

Could not initialize tesseract / OCRBridge when OCRing a document

Q: When I OCR a document, I see errors such as “Could not initialize tesseract”, “OCR library is not loaded: null”, and “unable to initiate OCRBridge”. What are these errors and how can I fix them? A: You’re seeing these errors because you have renamed, moved, or deleted the “tess” folder under your user […]

OCR for Non-Latin Languages & Multi-Language OCR

PDF Studio 11 comes with a new OCR engine that supports non-Latin and CJK languages. New Latin languages will also be added to the list of available languages. The complete list of new OCR languages can be found below. In addition to the new languages, PDF Studio 11 also has the ability […]

How to proofread and correct OCRed text in a PDF

Q: After running a PDF through OCR, I need to be able to inspect the result and, if necessary, correct it. Is it possible to show the text added by the OCR in PDF Studio? A: We don’t have a specific tool or view to allow users to inspect the OCR text yet, but we […]

OCRing images on a PDF page that already contains text

Q: If I have a mixed-content document containing some text and some images, can PDF Studio OCR the images only? A: PDF Studio can handle mixed-content pages, i.e., pages that contain both images and text content. PDF Studio will simply ignore any existing text content and perform OCR on the rest of the page, so […]

OCRing PDF Documents: Page contains invisible text

Q: When I OCR a PDF document, I see a message in the dialog reading “Page [x] contains invisible text… not processed”. What is it and how can I solve it? A: PDF Studio performs OCR page by page, and the message you are seeing means that the corresponding page has already been OCRed (using PDF […]

OCR a Batch of PDF Documents

Q: How can I OCR a bunch of PDF documents all at once? A: PDF Studio 9 and above come with a Batch OCR option that allows you to OCR multiple PDF files at once. This is useful if you need to add text to a large number of documents. To OCR multiple PDFs using the Batch OCR […]

How to OCR a PDF Document to add Searchable Text

From Existing Document: Launch PDF Studio and open the PDF document that you wish to add searchable text to. Go to Document -> OCR – Create Searchable PDF from the top menu. From the Language drop-down, select the language you wish to use. Note: The first time you use OCR, you will need to download the language packs. To do […]

Permission error downloading OCR languages

In PDF Studio 8, when attempting to download OCR language files, you may receive an error like: Error: user does not have permission to write to: opt/pdfstudio8/tess/tessdata. OCR language files are downloaded into the PDF Studio installation folder, and sometimes a user will not have permission to write to this folder. If you see […]

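
The permission error in the last entry comes down to whether the current user can write to the folder that receives the OCR language files. The following is a minimal sketch of how that could be checked from outside PDF Studio, assuming Python is available. The function name `can_write_language_files` is hypothetical, and the example path is the one from the error message with a leading slash added as an assumption; this is not part of PDF Studio itself.

```python
import os


def can_write_language_files(folder: str) -> bool:
    """Return True if `folder` exists and the current user may create files in it."""
    if not os.path.isdir(folder):
        return False
    # Creating a file in a directory needs write and search (execute) permission.
    return os.access(folder, os.W_OK | os.X_OK)


if __name__ == "__main__":
    # Assumed absolute form of the path shown in the error message.
    tessdata = "/opt/pdfstudio8/tess/tessdata"
    print(tessdata, "writable:", can_write_language_files(tessdata))
```

If the check reports that the folder is not writable, the download would be expected to fail with a permission error like the one quoted above.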