Fulltext search and text/plain files | Community
artika4biz
Level 2
May 10, 2022

Out-of-the-box AEM / Sling instances do not index [nt:file] documents with the text/plain MIME type. I've created a custom Lucene index with a Tika configuration node like the one shown below (I also tried different parser classes, such as TextAndCSVParser), but something still seems to be wrong. I can survive without full-text search on text/plain files, but it's a matter of principle. Any ideas for solving this problem?

 

<properties>
<parsers>
<parser class="org.apache.tika.parser.defaultParser">
<mime>text/plain</mime>
</parser>
</parsers>
</properties>

 


2 replies

maxbarrass-anchora
Adobe Champion
May 13, 2022

Is this your config?

 

<?xml version="1.0" encoding="UTF-8"?>
<properties>
<parsers>
<parser class="org.apache.tika.parser.DefaultParser">
<mime>text/plain</mime>
</parser>
</parsers>
</properties>

 

Did you follow this article: https://helpx.adobe.com/au/experience-manager/kb/how-to-use-custom-tika-configuration.html ?

artika4biz
Level 2
May 13, 2022

Yes, sort of. My config has the XML header, and some MIME types are disabled by mapping them to the org.apache.tika.parser.EmptyParser class.

 

<?xml version="1.0" encoding="UTF-8"?>
<properties>
<parsers>
<parser class="org.apache.tika.parser.defaultParser">
<mime>text/plain</mime>
</parser>

<parser class="org.apache.tika.parser.EmptyParser">
<mime>application/pdf</mime>
<mime>application/vnd.openxmlformats-officedocument.spreadsheetml.sheet</mime>
<mime>application/vnd.ms-excel.sheet.macroenabled.12</mime>
<mime>application/vnd.openxmlformats-officedocument.spreadsheetml.template</mime>
<mime>application/vnd.ms-excel.template.macroenabled.12</mime>
<mime>application/vnd.ms-excel.addin.macroenabled.12</mime>
<mime>application/vnd.ms-excel</mime>
<mime>application/vnd.ms-excel.sheet.binary.macroenabled.12</mime>
</parser>
</parsers>
</properties>
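
One detail that may be worth checking in the configuration above: Java resolves class names case-sensitively, and the Tika class is named DefaultParser (capital D), as written in the configuration quoted in the earlier reply. The sketch below repeats the posted configuration with only that attribute changed and the EmptyParser list shortened to a single entry for brevity. Whether this change alone restores indexing of text/plain files is an assumption and is not confirmed anywhere in this thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<properties>
  <parsers>
    <!-- Class name spelled with a capital D; class lookup is case-sensitive -->
    <parser class="org.apache.tika.parser.DefaultParser">
      <mime>text/plain</mime>
    </parser>

    <!-- MIME types mapped to EmptyParser yield no extracted text
         (list shortened here; the full list is in the post above) -->
    <parser class="org.apache.tika.parser.EmptyParser">
      <mime>application/pdf</mime>
    </parser>
  </parsers>
</properties>
```

Files that were stored before the configuration changed would probably only become searchable after the index is reindexed.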

artika4biz
Level 2
May 19, 2022

The problem is still there! Does anyone have the same problem?