June 26, 2020
Solved

Assets using tags with special characters are not returned using omnisearchbar


Hi,

If I create a tag with the name "pröva", AEM will create a cq:tag with:

  • the jcr:name set to "prova" - i.e. the 'ö' is stripped out and replaced with 'o'
  • the jcr:title set to "pröva" - i.e. the 'ö' is not stripped out

If I now tag an asset with this tag, the asset node's metadata/cq:tags property references the jcr:name of the tag and not its jcr:title, e.g. the asset will now have cq:tags set to "prova".
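
As a rough illustration of the behaviour described above, here is a minimal Python sketch. It is not AEM code: `fold_diacritics` and `make_tag` are hypothetical helpers, and AEM's real name-generation rules may differ. The sketch only reproduces the observed outcome (name "prova", title "pröva", asset referencing the name).

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Decompose accented characters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def make_tag(title: str) -> dict:
    # Hypothetical stand-in for tag creation: the node name is the folded,
    # lower-cased title, while jcr:title keeps the original spelling.
    return {"jcr:name": fold_diacritics(title).lower(), "jcr:title": title}


tag = make_tag("pröva")

# The asset references the tag by name, so the title never reaches the asset.
asset = {"metadata": {"cq:tags": [tag["jcr:name"]]}}
```

After this runs, `asset["metadata"]["cq:tags"]` holds only the folded name, which is why the native spelling is nowhere on the asset node.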

This means when using the omnisearchbar to try and retrieve the asset using the native spelling of the tag (i.e. searching for "pröva") nothing is returned.

This is because the omnisearchbar performs the following query:

(/jcr:root/content/dam//element(*, nt:folder)[(jcr:contains(., 'pröva'))] | /jcr:root/content/dam//element(*, dam:Asset)[(jcr:contains(., 'pröva'))])

And of course jcr:contains will only look at the cq:tags property and won't find pröva anywhere.

I have contacted Adobe support, but they told me to "update the indexes definition so that tags jcr:title are also part of the index", which as far as I can tell is a useless response.

Are the only options:

  • extend the omnisearch functionality to actually work with foreign tags (possibly by stripping out the foreign chars from the search term), or
  • every time someone tags an asset, also store the cq:tag title (not name) as a hidden field on the asset, so that jcr:contains matches against the special chars
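
To make the two options concrete, here is a toy Python model of the search. Everything in it is an assumption for illustration: `search` stands in for a jcr:contains-style match over an asset's own properties, `tagTitles` is a hypothetical hidden property name, and none of this is the omnisearch implementation.

```python
import unicodedata

TAGS = {"prova": "pröva"}  # jcr:name -> jcr:title


def fold(text):
    """Strip diacritics and lower-case, e.g. 'pröva' -> 'prova'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


def fulltext(asset):
    """All string values a contains-style search would see on the asset."""
    words = []
    for value in asset.values():
        words.extend(value if isinstance(value, list) else [value])
    return words


def search(assets, term):
    return [a["name"] for a in assets if term in fulltext(a)]


def search_folded(assets, term):
    # Option 1: fold the search term before running the query.
    return search(assets, fold(term))


def tag_asset(asset, tag_name):
    # Option 2: store the tag title next to the tag name when tagging.
    asset.setdefault("cq:tags", []).append(tag_name)
    asset.setdefault("tagTitles", []).append(TAGS[tag_name])


plain = {"name": "a.jpg", "cq:tags": ["prova"]}
enriched = {"name": "b.jpg"}
tag_asset(enriched, "prova")

miss = search([plain], "pröva")             # current behaviour
hit_folded = search_folded([plain], "pröva")  # option 1
hit_hidden = search([enriched], "pröva")      # option 2
```

In this model option 1 needs no change to stored content but only works while names are a pure folding of titles, whereas option 2 depends on every tagging path writing the extra field.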

Appreciate any guidance, as Adobe support were unable to help, despite insisting AEM has full support for languages.

3 replies

VeenaVikraman
Community Advisor
June 26, 2020

@joerghoh, any help you can give on this question?

Adobe Employee
June 26, 2020

Search is built on Lucene.

So this isn't so much an omnisearch issue as the index definitions not tokenizing the characters. For example, out of the box AEM will not tokenize Chinese.

You need to look at adding an analyzer to your index definition to handle your Swedish word:

https://lucene.apache.org/core/4_7_0/analyzers-common/overview-summary.html


You can probably get away with the StandardAnalyzer.

There are some rough steps on how to do this most of the way down this article; look for the section "SPECIFYING THE ANALYZER CLASS DIRECTLY":

https://helpx.adobe.com/ca/experience-manager/6-3/sites/deploying/using/queries-and-indexing.html#Configuringtheindexes
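
For readers unfamiliar with what an analyzer does, here is a toy sketch in plain Python. It is not Lucene and not an Oak index definition: `tokenize`, `analyze`, and `matches` are hypothetical names, and the accent-folding step is an assumption added for illustration (as far as I know, StandardAnalyzer tokenizes and lower-cases but does not strip accents by itself). The point is only that the same analysis is applied to the indexed text and to the query.

```python
import re
import unicodedata


def tokenize(text):
    """Split into lower-cased word tokens."""
    return re.findall(r"\w+", text.lower())


def fold(token):
    """Strip diacritics from a single token."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def analyze(text, fold_accents):
    tokens = tokenize(text)
    return [fold(t) for t in tokens] if fold_accents else tokens


def matches(indexed_text, query, fold_accents):
    # The same analysis chain runs on both sides of the comparison.
    index_terms = set(analyze(indexed_text, fold_accents))
    return all(t in index_terms for t in analyze(query, fold_accents))


same_spelling = matches("Ett pröva-foto", "pröva", fold_accents=False)
no_folding = matches("prova", "pröva", fold_accents=False)
with_folding = matches("prova", "pröva", fold_accents=True)
```

Whether a folding step like this applies to how cq:tags values are indexed in AEM is not established in this thread.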

 

 

Good luck.

 

Level 2
June 26, 2020
Thanks aemmarc. Unless I'm missing something, I'm not sure that would help. The tag is stored on the asset as "prova" i.e. with no special characters anyway. There are no special characters to add to an index based on the example I've given - it's a problem with how AEM links a tagged asset back to a tag. There also may be issues indexing special characters, but that would be the next issue to face.
joerghoh
Adobe Employee
Accepted solution
June 27, 2020

Hi Tim,

In other words, the search only looks at the names of the tags (because the name is part of the reference to the tag), and not at the jcr:title of the tags referenced from the assets. Is that correct?

From a technical point of view this is expected, because JCR imposes some limitations on node names. That means that if the jcr:title does not match the name, you won't find the asset. Another limitation: if you change the title of a tag but not its name, and you use this new title to find an asset, you won't get any result.

In both cases it's the problem that the jcr:title of a referenced path is not considered for search, thus more likely a missing feature than a bug.
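
The rename case can be sketched with a toy model in Python. The data layout is an assumption for illustration only: `tagTitles` is a hypothetical property holding a copy of the title, and `titles_at_query_time` is a hypothetical helper that follows the reference instead of relying on a copy.

```python
# Tag store: node name -> properties. The asset references the tag by name.
tags = {"prova": {"jcr:title": "pröva"}}
asset = {"cq:tags": ["prova"], "tagTitles": ["pröva"]}  # tagTitles: copied at tagging time


def titles_at_query_time(asset, tags):
    """Resolve the current title of every tag the asset references."""
    return [tags[name]["jcr:title"] for name in asset["cq:tags"]]


# The title is changed, but the node name (and so the reference) stays the same.
tags["prova"]["jcr:title"] = "försök"

copied = asset["tagTitles"]
resolved = titles_at_query_time(asset, tags)
```

Here the copied title still says "pröva" while the resolved title is "försök", so a copy stored on the asset would have to be refreshed whenever a tag title changes.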

 

 

Level 2
June 30, 2020
It might be expected by you and me, Jorg, being familiar with the intricacies of JCR, but it's not expected by a customer who was told there is full Swedish language support! I politely disagree that it's a missing feature: tagging assets and then searching for them by tag does not currently work OOTB as expected = bug. Regardless of semantics, the user is unable to use the product. What would you recommend as the best approach to fix this: customisation of the search bar, or duplicating the correct text into a hidden field on each asset? (P.S. I don't love that this forum allows admins to randomly mark topics as correctly answered. I think as the question asker I should be the judge of that, no?!)