SOLVED

Search for text in scanned (OCRed) PDF documents

vishalv75424481
Level 2

I want to develop a component for searching PDF assets. The existing QueryBuilder API does not search through scanned and OCRed PDF files; it only searches normal PDF files. Is there a way I can achieve this?

1 Accepted Solution
smacdonald2008
Correct answer by
Level 10

AEM does not support this out of the box. You would need a Java library that can perform OCR (assuming a Java API exists that can do this job) and build a custom AEM service around it. This Java API looks like it may be the way to proceed with this use case:

http://asprise.com/royalty-free-library/java-ocr-api-overview.html
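
To make the "custom service" idea a bit more concrete, here is a minimal plain-Java sketch of the shape such a service might take. Everything in it is an assumption made for illustration: `OcrEngine`, `ScannedPdfSearchService`, and `FakeOcrEngine` are hypothetical names, not part of AEM or of the Asprise library, and the in-memory map only stands in for wherever the extracted text would really be stored (for example, a property on the asset). In a real project the interface would be implemented with whichever OCR library you choose, and the class would be registered as an OSGi service.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical abstraction over the chosen OCR library; not an AEM or Asprise API. */
interface OcrEngine {
    String extractText(byte[] pdfBytes);
}

/** Hypothetical service: runs OCR once per PDF and keeps the text so it can be searched later. */
class ScannedPdfSearchService {
    private final OcrEngine engine;
    // Stand-in for a searchable text property stored with each asset (assumption).
    private final Map<String, String> extractedTextByPath = new LinkedHashMap<>();

    ScannedPdfSearchService(OcrEngine engine) {
        this.engine = engine;
    }

    /** Extracts text from the PDF bytes and stores it, lower-cased, under the asset path. */
    void index(String assetPath, byte[] pdfBytes) {
        String text = engine.extractText(pdfBytes);
        extractedTextByPath.put(assetPath, text == null ? "" : text.toLowerCase(Locale.ROOT));
    }

    /** Returns the paths of indexed assets whose extracted text contains the term (case-insensitive). */
    List<String> search(String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> entry : extractedTextByPath.entrySet()) {
            if (entry.getValue().contains(needle)) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }
}

/** Stub used only for this sketch: treats the bytes as UTF-8 text instead of doing real OCR. */
class FakeOcrEngine implements OcrEngine {
    @Override
    public String extractText(byte[] pdfBytes) {
        return new String(pdfBytes, StandardCharsets.UTF_8);
    }
}
```

The point of hiding the OCR call behind an interface is that the search side never needs to know which library produced the text, so you can swap libraries or test with a stub like the one above. Running OCR at indexing time rather than at query time is also a design assumption here; OCR tends to be slow, so doing it once per asset is usually the safer choice.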

