Is it true that the content of assets such as PDFs is not indexed in AEMaaCS? I can see that Tika is being used, and PDFs are not filtered out in the OOTB config.
Why is the content not being searched? Are we required to use a third-party search tool like Elastic/Solr for this functionality?
Please check the following details:
In Adobe Experience Manager as a Cloud Service (AEMaaCS), full-text indexing of binary files such as PDFs is supported out of the box. This is done using Apache Tika, which can extract text from various file formats, including PDF.
Indexing configuration: Check your Oak index definitions to make sure that full-text indexing is enabled for nt:file nodes (which is what AEM uses to store binary files like PDFs). You can do this in CRXDE Lite.
Sometimes a PDF file does not contain extractable text, so this is also worth checking on the files that are not being found.
https://suman-shekhar.medium.com/aem-text-extraction-using-apache-tika-d0eb740eec39
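One quick way to confirm whether PDF text is actually reaching the index is to run a full-text query against the AEM QueryBuilder JSON servlet and see whether the asset comes back. The following is only a rough sketch, not an official procedure: the host (`http://localhost:4502`), the `admin`/`admin` credentials, the search term, and the helper names `build_fulltext_query_url` and `find_assets` are all assumptions for illustration, and basic authentication as shown would typically only work against a local SDK instance.

```python
import base64
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_fulltext_query_url(base_url, term, root_path="/content/dam", limit=10):
    """Build a QueryBuilder URL that runs a full-text search over DAM assets."""
    params = {
        "path": root_path,      # restrict the search to this subtree
        "type": "dam:Asset",    # only return asset nodes
        "fulltext": term,       # word or phrase expected inside the PDF
        "p.limit": str(limit),  # cap the number of hits
    }
    return base_url.rstrip("/") + "/bin/querybuilder.json?" + urlencode(params)


def find_assets(base_url, term, user, password):
    """Run the query and return the paths of matching assets.

    Assumes basic auth is accepted by the instance (hypothetical setup).
    """
    url = build_fulltext_query_url(base_url, term)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = Request(url, headers={"Authorization": "Basic " + token})
    with urlopen(request) as response:
        result = json.load(response)
    return [hit.get("path") for hit in result.get("hits", [])]


if __name__ == "__main__":
    # Hypothetical values: search for a phrase you know occurs in one PDF.
    for path in find_assets("http://localhost:4502", "annual report", "admin", "admin"):
        print(path)
```

If a phrase that is known to appear in a PDF returns no hits, the problem is more likely in the index definition or in text extraction than in the search component itself.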
Please cross-check whether a custom index has been created for damAssetLucene. If so, please make sure the Tika configuration has been created as described in:
Content Search and Indexing | Adobe Experience Manager