Assets searched with the text in pdf file | Community
Level 3
March 3, 2023

Assets searched with the text in pdf file


Hi Team,

 

I want to search PDF assets by their contents in AEM Cloud. For example, if my xyz.pdf file contains the text "Avocado Breakfast" and I search the DAM for "Avocado breakfast", I should get this PDF in the results.

Just FYI, the PDF has no "Avocado breakfast" tag assigned to it in the DAM.

 

Thanks,

SD


Saravanan_Dharmaraj
Community Advisor
March 3, 2023

The OOTB AEM DAM search should return results (including PDFs) based on the text content of the PDF. So if you search for "Avocado Breakfast", xyz.pdf should come up in the search.

 

"Behind the scenes, Apache Lucene fetches the documents in the repository and indexes the content based on the metadata and text content. The index update thread wakes up every five seconds looking for content updates. Apache Lucene uses Apache Tika, a content analysis tool, to get the internal detail of documents like metadata and text in the document to create the indexes."

 

https://blogs.perficient.com/2017/05/08/indexing-bogging-aem-down-disable-apache-tika/ 
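One way to check whether the text inside the PDF is actually indexed is to run a full-text query against the QueryBuilder JSON servlet and see whether xyz.pdf appears in the hits. The sketch below only builds the request URL. The host (a local author instance on port 4502), the /content/dam root, and the helper name build_fulltext_query_url are assumptions for illustration; on a Cloud Service environment the host and authentication will differ.

```python
from urllib.parse import urlencode


def build_fulltext_query_url(host, text, dam_path="/content/dam", limit=10):
    """Build a QueryBuilder JSON URL for a full-text search over DAM assets.

    Hypothetical helper: host, dam_path, and limit are illustrative
    assumptions, not values taken from the original thread.
    """
    params = {
        "path": dam_path,        # restrict the search to the DAM tree
        "type": "dam:Asset",     # only return asset nodes
        "fulltext": text,        # full-text predicate (metadata and extracted text)
        "p.limit": str(limit),   # cap the number of hits returned
    }
    return f"{host}/bin/querybuilder.json?{urlencode(params)}"


# Assumed local author instance; replace with your own environment's host.
url = build_fulltext_query_url("http://localhost:4502", "Avocado Breakfast")
print(url)
```

Opening the printed URL in a browser session that is logged in to the author instance should list the matching assets. If xyz.pdf is missing from the hits, the PDF's text was probably not extracted or not covered by the full-text index.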

SDusane
Author
Level 3
March 6, 2023

I am searching by the PDF's text content, but it is not searchable by default. How can I resolve this?