Fulltext search and text/plain files


Level 2

Out-of-the-box AEM / Sling instances do not index [nt:file] documents with the text/plain MIME type. I've created a custom Lucene index with a Tika configuration node like the one below (I also tried different parser classes, such as TextAndCSVParser), but something still seems to be wrong. I can survive without full-text search on text/plain files, but it's a matter of principle. Any ideas for solving this problem?

 

<properties>
  <parsers>
    <parser class="org.apache.tika.parser.defaultParser">
      <mime>text/plain</mime>
    </parser>
  </parsers>
</properties>

 

3 Replies


Adobe Champion

Is this your config?

 

<?xml version="1.0" encoding="UTF-8"?>
<properties>
  <parsers>
    <parser class="org.apache.tika.parser.DefaultParser">
      <mime>text/plain</mime>
    </parser>
  </parsers>
</properties>

 

Did you follow this post: https://helpx.adobe.com/au/experience-manager/kb/how-to-use-custom-tika-configuration.html ?


Level 2

Yes, sort of. My file has the XML header, and some MIME types are disabled by mapping them to the org.apache.tika.parser.EmptyParser class.

 

<?xml version="1.0" encoding="UTF-8"?>
<properties>
  <parsers>
    <parser class="org.apache.tika.parser.defaultParser">
      <mime>text/plain</mime>
    </parser>

    <parser class="org.apache.tika.parser.EmptyParser">
      <mime>application/pdf</mime>
      <mime>application/vnd.openxmlformats-officedocument.spreadsheetml.sheet</mime>
      <mime>application/vnd.ms-excel.sheet.macroenabled.12</mime>
      <mime>application/vnd.openxmlformats-officedocument.spreadsheetml.template</mime>
      <mime>application/vnd.ms-excel.template.macroenabled.12</mime>
      <mime>application/vnd.ms-excel.addin.macroenabled.12</mime>
      <mime>application/vnd.ms-excel</mime>
      <mime>application/vnd.ms-excel.sheet.binary.macroenabled.12</mime>
    </parser>
  </parsers>
</properties>
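
A side note for readers comparing the snippets in this thread: Java class names are case-sensitive, and the Tika class is spelled org.apache.tika.parser.DefaultParser, as in the reply above. A minimal sketch of the same file with that spelling might look like the following. It is not a confirmed fix for this thread; it assumes the file is stored as tika/config.xml under the custom index definition and that the index is reindexed after the change. The list of disabled types is shortened for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: assumes this is saved as tika/config.xml under the custom Lucene index definition -->
<properties>
  <parsers>
    <!-- Class name spelled with a capital D; restricted to text/plain -->
    <parser class="org.apache.tika.parser.DefaultParser">
      <mime>text/plain</mime>
    </parser>
    <!-- EmptyParser extracts no text, so the types listed here are skipped -->
    <parser class="org.apache.tika.parser.EmptyParser">
      <mime>application/pdf</mime>
      <mime>application/vnd.ms-excel</mime>
    </parser>
  </parsers>
</properties>
```

If the lowercase spelling in the posts is only a typo made while writing the message, this sketch changes nothing and the cause lies elsewhere.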


Level 2

The problem is still there! Does anyone have the same problem?