We are using AEM 6.3 SP1 and have had cases where a single user running a SimpleSearch query containing Unicode characters brought down our AEM instances. A simple example is something like:
import com.day.cq.search.SimpleSearch;
import com.day.cq.search.Predicate;
import com.day.cq.search.result.SearchResult;

SimpleSearch simpleSearch = resource.adaptTo(SimpleSearch.class);
simpleSearch.setQuery("Knights of Columbus (@KofC) | TwitterPlan de Mobilité d Entreprise - PDFAllt om Bilar – Sveriges största motorsajt | Expressen | Allt om BilarStandard Bank Online Banking - Search Results | TimeErrors | Developers");
// Predicate(name, type) takes the predicate type as its second argument,
// so the path value is set explicitly rather than passed to the constructor.
Predicate p = new Predicate("path");
p.set("path", "/content");
simpleSearch.addPredicate(p);
SearchResult searchResult = simpleSearch.getResult();
That is an example of a query that made one of our AEM nodes completely unresponsive. We never ran into this issue using AEM 5.6.1, but it has happened a few times with various Unicode characters in AEM 6.3. Currently we are working around it by stripping out all non-ASCII characters before doing the search. With the Unicode characters removed, the queries are very fast.
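For anyone wanting to try the same workaround, here is a minimal sketch of stripping non-ASCII characters before the string is handed to SimpleSearch. The names QuerySanitizer and stripNonAscii are made up for illustration and are not part of any AEM API. Note that this simply drops the non-ASCII letters, so it changes what is being searched for.

```java
final class QuerySanitizer {

    // Removes every character outside the 7-bit ASCII range (hypothetical helper).
    static String stripNonAscii(String query) {
        return query.replaceAll("[^\\x00-\\x7F]", "");
    }

    public static void main(String[] args) {
        // \u00E9 is the e-acute from the example query, written as an escape
        String raw = "Mobilit\u00E9 d Entreprise";
        System.out.println(stripNonAscii(raw)); // prints "Mobilit d Entreprise"
    }
}
```

The cleaned string would then be passed to simpleSearch.setQuery(...) in place of the raw input.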
Is there a permanent fix for this?
Edit: Sort of off topic, if anyone can let me know how to format the code so it shows on multiple lines I would appreciate it.
Hi,
Please have a look at the SimpleSearch Javadoc ("The Adobe AEM Quickstart and Web Application").
void setQuery(String query) accepts a String. Try putting the Unicode characters directly into the string literals in your code by escaping them with \u:
// The Danish letters Æ Ø Å
String myString = "\u00C6\u00D8\u00C5";
That does seem to work better, but how would I achieve this without a String literal? This is input that is coming in from an HTTP query parameter, so I can't just use a String literal like that.
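On the Java side, a \u escape is only source-code notation: at runtime the String holds the same characters whether it came from a literal or from a request. A small sketch of that, assuming the parameter arrives percent-encoded as UTF-8. In a servlet the container normally does this decoding already, and decodeParam is a hypothetical helper, not an AEM method.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class UnicodeEscapeDemo {

    // Decodes a percent-encoded query parameter value as UTF-8 (hypothetical helper).
    static String decodeParam(String raw) throws UnsupportedEncodingException {
        return URLDecoder.decode(raw, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String fromLiteral = "\u00C6\u00D8\u00C5";               // the three Danish letters
        String fromRequest = decodeParam("%C3%86%C3%98%C3%85");  // same letters, as sent over HTTP
        System.out.println(fromLiteral.equals(fromRequest));     // prints "true"
    }
}
```

So there is nothing to "escape" for request input; what matters is that the bytes are decoded with the right charset.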
Can you enable debug logs for the search and share the query that is generated at the end? Maybe we can tune the index definitions to make it perform better.
I was thinking the unicode literal encoding was helping, but I later found that it was doing some weird String conversions and ended up just turning the unicode characters to question marks, so it really wasn't helping like I thought it was.
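A sketch of one way the question marks can appear: when a Java String is encoded with a charset that cannot represent a character, that character is replaced by '?'. Whether this is the exact conversion that happened here is an assumption, since the thread does not show the code. viaAscii is a hypothetical helper.

```java
import java.nio.charset.StandardCharsets;

final class QuestionMarkDemo {

    // Encodes to US-ASCII and back; unmappable characters become '?' on encoding.
    static String viaAscii(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // \u00F6 is the o-umlaut from the example query
        System.out.println(viaAscii("st\u00F6rsta motorsajt")); // prints "st?rsta motorsajt"
    }
}
```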
This particular search seems to break AEM in multiple ways. One is the Unicode characters; the other seems to be just the sheer number of words in the search. Even after removing the Unicode characters I have to knock it down to "Knights of Columbus KofC TwitterPlan de Mobilit d Entreprise - PDFAllt om Bilar Sveriges strsta motorsajt Expressen Allt om BilarStandard Bank" before the search no longer takes down the instance.
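As a stopgap against very long inputs, one option is to cap the number of whitespace-separated terms before calling setQuery. A minimal sketch; limitTerms and the cap used below are assumptions chosen for illustration, not a tested threshold.

```java
import java.util.Arrays;

final class QueryLimiter {

    // Keeps at most maxTerms whitespace-separated terms of the query (hypothetical helper).
    static String limitTerms(String query, int maxTerms) {
        String trimmed = query.trim();
        String[] terms = trimmed.split("\\s+");
        if (terms.length <= maxTerms) {
            return trimmed;
        }
        return String.join(" ", Arrays.copyOf(terms, maxTerms));
    }

    public static void main(String[] args) {
        String query = "Knights of Columbus KofC TwitterPlan de Mobilit d Entreprise";
        System.out.println(limitTerms(query, 4)); // prints "Knights of Columbus KofC"
    }
}
```

This does not address why the long query is slow in the first place; it only bounds what a single user can submit.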
16.05.2018 09:49:15.019 *INFO* [127.0.0.1 [1526482153050] GET /content/website/Search.html HTTP/1.1] com.day.cq.search.ext.impl.SimpleSearchImpl SimpleSearch is searching with the types: [cq:Page, dam:Asset]

16.05.2018 09:49:15.023 *DEBUG* [127.0.0.1 [1526482153050] GET /content/website/Search.html HTTP/1.1] com.day.cq.search.impl.builder.QueryImpl executing query (URL):
13_group.group.1_path=%2fcontent%2fwebsite&13_group.group.p.or=true&14_group.2_group.1_path=%2fcontent%2fdam%2fotherwebsite%2fcenter%2fresource&14_group.2_group.1_path.self=true&14_group.2_group.2_path=%2fcontent%2fotherwebsite%2fportal%2fauthenticated&14_group.2_group.2_path.self=true&14_group.2_group.p.not=true&14_group.2_group.p.or=true&15_group.p.not=true&15_group.primaryType=jcr%3acontent%2fjcr%3aprimaryType&15_group.primaryType.value=nt%3aunstructured&16_group.hideInNav=jcr%3acontent%2fhideInNav&16_group.hideInNav.value=true&16_group.p.not=true&group.0_fulltext=Knights%20of%20Columbus%20(%40KofC)%20%7c%20TwitterPlan%20de%20Mobilit%c3%a9%20d%20Entreprise%20-%20PDFAllt%20om%20Bilar%20%e2%80%93%20Sveriges%20st%c3%b6rsta%20motorsajt%20%7c%20Expressen%20%7c%20Allt%20om%20BilarStandard%20Bank%20Online%20Banking%20-%20Search%20Results%20%7c%20TimeErrors%20%7c%20Developers&group.0_fulltext.relPath=&group.1_fulltext=Knights%20of%20Columbus%20(%40KofC)%20%7c%20TwitterPlan%20de%20Mobilit%c3%a9%20d%20Entreprise%20-%20PDFAllt%20om%20Bilar%20%e2%80%93%20Sveriges%20st%c3%b6rsta%20motorsajt%20%7c%20Expressen%20%7c%20Allt%20om%20BilarStandard%20Bank%20Online%20Banking%20-%20Search%20Results%20%7c%20TimeErrors%20%7c%20Developers&group.1_fulltext.relPath=%40jcr%3atitle&group.2_fulltext=Knights%20of%20Columbus%20(%40KofC)%20%7c%20TwitterPlan%20de%20Mobilit%c3%a9%20d%20Entreprise%20-%20PDFAllt%20om%20Bilar%20%e2%80%93%20Sveriges%20st%c3%b6rsta%20motorsajt%20%7c%20Expressen%20%7c%20Allt%20om%20BilarStandard%20Bank%20Online%20Banking%20-%20Search%20Results%20%7c%20TimeErrors%20%7c%20Developers&group.2_fulltext.relPath=%40jcr%3adescription&group.p.or=true&languages=&lastModified.lowerBound=&lastModified.property=jcr%3acontent%2fcq%3alastModified&lastModified.upperBound=&mimeTypes=jcr%3acontent%2fjcr%3amimeType&mimeTypes.value=&nodeTypes.p.or=true&nodeTypes.type=dam%3aAsset&orderByScore=%40jcr%3ascore&orderByScore.sort=desc&p.excerpt=true&p.limit=10&p.offset=0&path=%2fcontent&tags=&tags.property=jcr%3acontent%2fcq%3atags

16.05.2018 09:49:15.023 *DEBUG* [127.0.0.1 [1526482153050] GET /content/website/Search.html HTTP/1.1] com.day.cq.search.impl.builder.QueryImpl executing query (predicate tree):
ROOT=group: limit=10, offset=0, excerpt=true[
    {group=group: or=true[
        {0_fulltext=fulltext: fulltext=Knights of Columbus (@KofC) | TwitterPlan de Mobilité d Entreprise - PDFAllt om Bilar – Sveriges största motorsajt | Expressen | Allt om BilarStandard Bank Online Banking - Search Results | TimeErrors | Developers, relPath=}
        {1_fulltext=fulltext: fulltext=Knights of Columbus (@KofC) | TwitterPlan de Mobilité d Entreprise - PDFAllt om Bilar – Sveriges största motorsajt | Expressen | Allt om BilarStandard Bank Online Banking - Search Results | TimeErrors | Developers, relPath=@jcr:title}
        {2_fulltext=fulltext: fulltext=Knights of Columbus (@KofC) | TwitterPlan de Mobilité d Entreprise - PDFAllt om Bilar – Sveriges största motorsajt | Expressen | Allt om BilarStandard Bank Online Banking - Search Results | TimeErrors | Developers, relPath=@jcr:description}
    ]}
    {path=path: path=/content}
    {languages=language: language=null}
    {tags=tagid: property=jcr:content/cq:tags, tagid=null}
    {mimeTypes=property: property=jcr:content/jcr:mimeType, value=null}
    {lastModified=daterange: property=jcr:content/cq:lastModified, lowerBound=null, upperBound=null}
    {orderByScore=orderby: orderby=@jcr:score, sort=desc}
    {13_group=group: [
        {group=group: or=true[
            {1_path=path: path=/content/website}
        ]}
    ]}
    {14_group=group: [
        {2_group=group: not=true, or=true[
            {1_path=path: path=/content/dam/otherwebsite/center/resource, self=true}
            {2_path=path: path=/content/otherwebsite/portal/authenticated, self=true}
        ]}
    ]}
    {15_group=group: not=true[
        {primaryType=property: property=jcr:content/jcr:primaryType, value=nt:unstructured}
    ]}
    {16_group=group: not=true[
        {hideInNav=property: property=jcr:content/hideInNav, value=true}
    ]}
    {nodeTypes=group: or=true[
        {type=type: type=cq:Page}
        {type=type: type=dam:Asset}
    ]}
]

16.05.2018 09:49:15.040 *DEBUG* [127.0.0.1 [1526482153050] GET /content/website/Search.html HTTP/1.1] com.day.cq.search.impl.builder.QueryImpl xpath query:
(/jcr:root/content/website//element(*, cq:Page)[(jcr:contains(., 'Knights of Columbus (@KofC) | TwitterPlan de Mobilité d Entreprise - PDFAllt om Bilar – Sveriges största motorsajt | Expressen | Allt om BilarStandard Bank Online Banking - Search Results | TimeErrors | Developers') or jcr:contains(@jcr:title, 'Knights of Columbus (@KofC) | TwitterPlan de Mobilité d Entreprise - PDFAllt om Bilar – Sveriges största motorsajt | Expressen | Allt om BilarStandard Bank Online Banking - Search Results | TimeErrors | Developers') or jcr:contains(@jcr:description, 'Knights of Columbus (@KofC) | TwitterPlan de Mobilité d Entreprise - PDFAllt om Bilar – Sveriges största motorsajt | Expressen | Allt om BilarStandard Bank Online Banking - Search Results | TimeErrors | Developers')) and not(jcr:content/@jcr:primaryType = 'nt:unstructured') and not(jcr:content/@hideInNav = 'true')] | /jcr:root/content/website//element(*, dam:Asset)[(jcr:contains(., 'Knights of Columbus (@KofC) | TwitterPlan de Mobilité d Entreprise - PDFAllt om Bilar – Sveriges största motorsajt | Expressen | Allt om BilarStandard Bank Online Banking - Search Results | TimeErrors | Developers') or jcr:contains(@jcr:title, 'Knights of Columbus (@KofC) | TwitterPlan de Mobilité d Entreprise - PDFAllt om Bilar – Sveriges största motorsajt | Expressen | Allt om BilarStandard Bank Online Banking - Search Results | TimeErrors | Developers') or jcr:contains(@jcr:description, 'Knights of Columbus (@KofC) | TwitterPlan de Mobilité d Entreprise - PDFAllt om Bilar – Sveriges största motorsajt | Expressen | Allt om BilarStandard Bank Online Banking - Search Results | TimeErrors | Developers')) and not(jcr:content/@jcr:primaryType = 'nt:unstructured') and not(jcr:content/@hideInNav = 'true')])/rep:excerpt(.) order by @jcr:score descending, @jcr:score descending

16.05.2018 09:49:15.058 *DEBUG* [127.0.0.1 [1526482153050] GET /content/website/Search.html HTTP/1.1] com.day.cq.search.impl.builder.QueryImpl xpath query creation took 33 ms
...
16.05.2018 10:19:47.524 *DEBUG* [127.0.0.1 [1526482153050] GET /content/website/Search.html HTTP/1.1] com.day.cq.search.impl.builder.QueryImpl entire query execution took 1832501 ms
Kunwar, have you had a chance to look at this?
We've run into the same problem. In our case the search string was relatively short: классификация+ЮНКТАД.
We figured out that, in our case, the problem was that we were first converting the user's string to a byte array (via UTF-8) and then converting it back to a string (via ASCII). The resulting garbage string was what we fed into SimpleSearch. It works if the original string contains only ASCII characters, but if not, the result is a live-lock of some kind. So it is probably a Jackrabbit/Lucene bug in some sense, but we had bad code that triggered it.
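A sketch of the kind of round trip described above (the names are hypothetical and this is not the poster's actual code): encoding as UTF-8 and decoding as ASCII leaves pure-ASCII input intact, but every byte of a non-ASCII character comes back as the replacement character U+FFFD.

```java
import java.nio.charset.StandardCharsets;

final class RoundTripDemo {

    // Mismatched charsets: encode as UTF-8, decode as US-ASCII.
    static String brokenRoundTrip(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.US_ASCII);
    }

    // Matching charsets: the string survives unchanged.
    static String correctRoundTrip(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String ascii = "Standard Bank";
        String nonAscii = "st\u00F6rsta"; // contains an o-umlaut, two bytes in UTF-8

        System.out.println(brokenRoundTrip(ascii).equals(ascii));        // prints "true"
        System.out.println(brokenRoundTrip(nonAscii).equals(nonAscii));  // prints "false"
        System.out.println(correctRoundTrip(nonAscii).equals(nonAscii)); // prints "true"
    }
}
```

Using the same charset on both sides, or not converting to bytes at all, avoids feeding the garbled string into the search.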
Thanks for posting the information!