We have a license for LiveCycle, but I can't find documentation on Adobe's OCR capabilities.
The goal is to create a service that receives a file (image or PDF), runs OCR on it, and returns the extracted text. If the input is a PDF, it would be nice to return a PDF with the newly derived text embedded in it.
I guess I want to know:
- Does Adobe have an API for me to use (most likely from C#) that will allow me to OCR a PDF/image?
- If Adobe does OCR, can it embed the derived text?
- Where can I find documentation/examples to do any of this?
You can use Acrobat in an automated service as long as it's licensed properly. You mention that you have a LiveCycle license but not which components it covers. If you have PDF Generator (PDF/G), it has an API that can OCR an image and return the resulting text of the scanned document. The API documentation is located here: http://help.adobe.com/en_US/livecycle/10.0/ProgramLC/javadoc/
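As a rough illustration of the service shape described in the question, here is a minimal Python sketch (Python only for brevity; the same structure should carry over to C# or Java). It looks at the leading magic bytes to tell a PDF from an image, hands the bytes to an OCR backend, and returns a searchable PDF only when the input was a PDF. The names `detect_kind`, `run_ocr`, `OcrResult`, and the `backend` callable are hypothetical and not part of any Adobe API; the backend is simply the place where a call into LiveCycle would go.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical backend signature (an assumption, not an Adobe API):
# takes the raw file bytes and the detected kind ("pdf" or "image"),
# returns (extracted_text, searchable_pdf_bytes_or_None).
OcrBackend = Callable[[bytes, str], Tuple[str, Optional[bytes]]]


@dataclass
class OcrResult:
    kind: str                        # "pdf" or "image"
    text: str                        # text recovered by OCR
    searchable_pdf: Optional[bytes]  # set only when the input was a PDF


def detect_kind(data: bytes) -> str:
    """Classify the input by its leading magic bytes."""
    if data.startswith(b"%PDF-"):
        return "pdf"
    image_signatures = (
        b"\x89PNG\r\n\x1a\n",  # PNG
        b"\xff\xd8\xff",       # JPEG
        b"II*\x00",            # TIFF, little-endian
        b"MM\x00*",            # TIFF, big-endian
    )
    if data.startswith(image_signatures):
        return "image"
    raise ValueError("unsupported file type")


def run_ocr(data: bytes, backend: OcrBackend) -> OcrResult:
    """Run the backend and apply the 'PDF in, PDF out' rule from the question."""
    kind = detect_kind(data)
    text, pdf = backend(data, kind)
    # Image inputs return text only, even if the backend produced a PDF.
    return OcrResult(kind, text, pdf if kind == "pdf" else None)
```

Keeping the backend behind a plain callable means the file-type handling can be tested with a stub before any LiveCycle call is wired in.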