SOLVED

Disable PDF Tika/Lucene Indexing in AEM6

Level 3

Hi,

We are working on disabling PDF indexing with Tika/Lucene and cannot find a guide for AEM 6.0. Is anyone able to explain how to do this?

Basically, we just need to edit tika-config.xml and get the instance to use the custom version.

Thanks!

Alex

1 Accepted Solution

Correct answer by
Level 3

Figured it out -

For anyone trying to do this: https://helpx.adobe.com/experience-manager/kb/how-to-optimize-lucene-index-to-gain-efficiency.html

Need to extract the tika-config.xml from the bundle .jar file, edit it, and merge the changed config file back in. They have an example config file on that page as well.
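
For anyone who would rather script the extract/edit/merge step than do it by hand, here is a minimal sketch in Python. It relies only on the fact that a .jar is a zip archive. The function names are made up for illustration, and the thread does not say which bundle jar or which path inside it holds tika-config.xml, so the sketch searches for the entry instead of assuming a location. The actual edit to the config is left to you (see the example on the linked page).

```python
import zipfile


def find_entries(jar_path, suffix="tika-config.xml"):
    """Return the names of all entries in the jar that end with `suffix`."""
    with zipfile.ZipFile(jar_path) as jar:
        return [name for name in jar.namelist() if name.endswith(suffix)]


def read_entry(jar_path, entry_name):
    """Extract one entry from the jar and return its raw bytes."""
    with zipfile.ZipFile(jar_path) as jar:
        return jar.read(entry_name)


def replace_jar_entry(jar_path, entry_name, new_bytes, out_path):
    """Write a copy of the jar to out_path with one entry's content swapped.

    All other entries are copied unchanged and in their original order,
    so the manifest stays where it was in the archive.
    """
    with zipfile.ZipFile(jar_path) as src:
        if entry_name not in src.namelist():
            raise KeyError(f"{entry_name} not found in {jar_path}")
        with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as dst:
            for info in src.infolist():
                data = src.read(info.filename)
                if info.filename == entry_name:
                    data = new_bytes  # the edited tika-config.xml
                dst.writestr(info, data)
```

Typical use would be: call find_entries() on a copy of the bundle jar, read_entry() to get the current config, edit it, then replace_jar_entry() to write a patched jar under a new name. Keep the original jar as a backup, and note that if the bundle happens to be signed, changing an entry will likely invalidate the signature.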

6 Replies

Level 10

Thanks for posting your solution. Also, for anyone interested in a deeper understanding of Lucene indexes, there will be a GEMs session on this in January. Sign up here:

AEM Gems - APAC Q&A session - Oak Lucene Indexes

Level 8

Rather than doing that, you could potentially use the path field of the workflow launcher to prevent the workflow from running on PDFs. Maybe try something like this:

/content/dam/(/.*/)renditions/original(?!\\.pdf)\\.+

I haven't verified that this would work, but it certainly would make more sense than extracting a jar, editing a config file etc etc.
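
If you go down this route, it is worth sanity-checking the expression against a few sample paths before putting it in a launcher. The sketch below is a rough illustration in Python, not a tested launcher configuration. It uses a different pattern from the one quoted above, and it assumes (this is not stated in the thread) that the original rendition sits at `<asset>/jcr:content/renditions/original`, with the file extension on the asset node. The launcher evaluates its path field inside AEM, so whether it accepts a lookahead like this is also an assumption that would need checking on a test instance.

```python
import re

# Assumed layout (not confirmed in the thread):
#   /content/dam/<folders>/<asset name>/jcr:content/renditions/original
NON_PDF_ORIGINAL = re.compile(
    r"^/content/dam/(?:.*/)?"               # any folder depth under /content/dam
    r"(?![^/]*\.pdf/)[^/]+"                 # asset node whose name does not end in .pdf
    r"/jcr:content/renditions/original$",   # only the original rendition
    re.IGNORECASE,
)


def is_non_pdf_original(path):
    """True if path looks like the original rendition of a non-PDF asset."""
    return NON_PDF_ORIGINAL.match(path) is not None
```

The negative lookahead is anchored at the start of the asset node name, so only names ending in .pdf are excluded. Names that merely contain ".pdf" somewhere in the middle still match.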

Employee

Hi Alex,

Did you get this to work? The reason I ask is that the article was written for AEM 5.x/CRX2, so I'm not sure it still applies to AEM 6. It might be worth raising a Daycare ticket to find out whether this is still supported.

Regards,

Opkar

Level 3

Opkar, please see my follow-up above: it works.

Thanks,

Alex

Employee

Hi Alex,

I saw your reply. I was just wondering whether this is a supported configuration for AEM 6, so it is best to check with Daycare. It may cause issues if you deploy this in production and Daycare then say it is not a supported configuration.

Regards,

Opkar