Fulltext search and text/plain files


Out-of-the-box AEM / Sling instances do not index [nt:file] documents with the text/plain MIME type. I've created a custom Lucene index with a Tika configuration node like the one shown below (I also tried different parser classes, such as TextAndCSVParser), but something still seems to be wrong. I can survive without full-text search on text/plain files, but it's a matter of principle. Any ideas for solving this problem?

 

<properties>
  <parsers>
    <parser class="org.apache.tika.parser.defaultParser">
      <mime>text/plain</mime>
    </parser>
  </parsers>
</properties>
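
For reference, the TextAndCSVParser variant mentioned above would look roughly like the sketch below. The fully qualified class name is an assumption: in recent Tika releases the class lives in the org.apache.tika.parser.csv package, but the Tika version bundled with a given AEM / Sling instance may not ship it at all, so this is only an illustration of the shape of the config, not a confirmed fix.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<properties>
  <parsers>
    <!-- Assumption: package name taken from recent Tika releases;
         verify it against the Tika version bundled with the instance. -->
    <parser class="org.apache.tika.parser.csv.TextAndCSVParser">
      <mime>text/plain</mime>
    </parser>
  </parsers>
</properties>
```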

 

3 Replies

Avatar

Level 2

Is this your config?

 

<?xml version="1.0" encoding="UTF-8"?>
<properties>
  <parsers>
    <parser class="org.apache.tika.parser.DefaultParser">
      <mime>text/plain</mime>
    </parser>
  </parsers>
</properties>

 

Did you follow this post? https://helpx.adobe.com/au/experience-manager/kb/how-to-use-custom-tika-configuration.html

Avatar

Level 2

Yes, sort of. There is the XML header, and some MIME types are disabled by assigning them to the org.apache.tika.parser.EmptyParser class.

 

<?xml version="1.0" encoding="UTF-8"?>
<properties>
  <parsers>
    <parser class="org.apache.tika.parser.defaultParser">
      <mime>text/plain</mime>
    </parser>

    <parser class="org.apache.tika.parser.EmptyParser">
      <mime>application/pdf</mime>
      <mime>application/vnd.openxmlformats-officedocument.spreadsheetml.sheet</mime>
      <mime>application/vnd.ms-excel.sheet.macroenabled.12</mime>
      <mime>application/vnd.openxmlformats-officedocument.spreadsheetml.template</mime>
      <mime>application/vnd.ms-excel.template.macroenabled.12</mime>
      <mime>application/vnd.ms-excel.addin.macroenabled.12</mime>
      <mime>application/vnd.ms-excel</mime>
      <mime>application/vnd.ms-excel.sheet.binary.macroenabled.12</mime>
    </parser>
  </parsers>
</properties>

Avatar

Level 2

The problem is still there! Does anyone have the same problem?