Can the asset search function support PDF content search | Community
Level 2
June 22, 2024
Solved

  • June 22, 2024
  • 1 reply
  • 704 views

I tested searching asset content. Content search works for Word, Excel, and PowerPoint files, but not for PDF files.

AEM integrates Apache Tika. How can we enable PDF content search with minimal modifications?

1 reply

lukasz-m
Community Advisor
Accepted solution
June 22, 2024

Hi @qingzhou,

In general, PDF content indexing and content search are supported OOTB; see the official documentation:

In other words, assuming you have not changed anything in the OOTB AEM configuration, this should simply work. I checked this quickly on my AEM 6.5 SP19 instance, and I was able to get a PDF file in the search results based on a phrase contained in the document, using both the Touch UI search and crx/de. If at any point you need to modify the Apache Tika configuration, this can be done as follows:

If this is not working for you, there might be an issue with the PDF file itself (how it was created, etc.). I would suggest downloading a sample PDF file from the Adobe site for testing purposes.
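
To repeat the crx/de check described above, a full-text JCR-SQL2 query can be pasted into the CRXDE query tool. The helper below is only a minimal sketch that builds such a query string. The class name `AssetFulltextQuery`, the `/content/dam` path, and the search phrase are illustrative assumptions. It also assumes that the default DAM asset index aggregates the text extracted from the original rendition onto the `dam:Asset` node, which may differ on a customized instance.

```java
/**
 * Hypothetical helper (not part of AEM) that builds a JCR-SQL2 full-text
 * query for DAM assets, suitable for pasting into the CRXDE query tool.
 */
public final class AssetFulltextQuery {

    private AssetFulltextQuery() {
    }

    /** Doubles single quotes so the value can sit inside a JCR-SQL2 string literal. */
    public static String escapeLiteral(String value) {
        return value.replace("'", "''");
    }

    /**
     * Builds a query that matches dam:Asset nodes below damPath whose indexed
     * full text (assumed to include extracted PDF text) contains the given terms.
     */
    public static String build(String damPath, String terms) {
        return "SELECT * FROM [dam:Asset] AS a"
                + " WHERE ISDESCENDANTNODE(a, '" + escapeLiteral(damPath) + "')"
                + " AND CONTAINS(a.*, '" + escapeLiteral(terms) + "')";
    }

    public static void main(String[] args) {
        // Example values only; use a phrase that appears in your test PDF.
        System.out.println(build("/content/dam", "quarterly report"));
    }
}
```

If the query returns Word or Excel assets for a phrase but never a PDF that contains the same phrase, that points at text extraction or indexing for PDFs rather than at the search UI.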

QingZhou
Author
Level 2
June 27, 2024

Hi lukasz-m,

We are using AEM 6.5.10.
PDF content search does not work on 6.5.10, and an Adobe engineer told me that it is supported on 6.5.21.
We cannot upgrade to the latest version because we have a lot of custom development.
How can I modify the code to enable PDF content search on version 6.5.10?