Issues with XPath Queries

michaelh2862615

04-02-2019

I have a servlet that searches the JCR based on a user-provided query. I log the query string before running it. However, I get a parse error when the query runs, and the error shows a differently formatted query than the string I logged. Am I doing something wrong, or is there a bug in the stack that parses/formats the query? Below is the servlet code that should reproduce the error, followed by the log messages I get. The search term used is "ma'a salama". req below is the SlingHttpServletRequest passed to the servlet's doGet method.

RequestParameter queryString = req.getRequestParameterMap().getValue("q");
String replace = org.apache.jackrabbit.util.Text.escapeIllegalXpathSearchChars(queryString.getString()).toLowerCase().replace(' ', '%');
String query = String.format(
     "/jcr:root/content/sample/en/home//*[@sling:resourceType='sample/components/page/person-bio' and (jcr:like(fn:lower-case(title), '%%%s%%') or jcr:like(fn:lower-case(bioinfo), '%%%s%%'))]",
     replace,
     replace);
LOG.info(query);
Iterator<Resource> resources = req.getResourceResolver().findResources(query, "xpath");

The logs look like this:

05.02.2019 10:09:56.697 *INFO* [0:0:0:0:0:0:0:1 [1549346996695] GET /bin/servlets/person/ HTTP/1.1] com.sample.servlet.PersonFinderServlet /jcr:root/content/sample/en/home/academics/divisions//*[@sling:resourceType='sample/components/page/person-bio' and (jcr:like(fn:lower-case(title), '%ma'a%salama%') or jcr:like(fn:lower-case(bioinfo), '%ma'a%salama%'))]

05.02.2019 10:09:56.699 *ERROR* [0:0:0:0:0:0:0:1 [1549346996695] GET /bin/servlets/person/ HTTP/1.1] org.apache.sling.engine.impl.SlingRequestProcessorImpl service: Uncaught SlingException

javax.jcr.query.InvalidQueryException: java.text.ParseException: Query:

/jcr:root/content/sample/en/home/academics/divisions//*[@sling:resourceType='sample/components/page/person-bio' and (jcr:like(fn:lower-case(title), '%ma'a(*)%salama%') or jcr:like(fn:lower-case(bioinfo), '%ma'a%salama%'))]; expected: )

Notice how the query in the error message has (*) inserted in the middle. Can anyone shed some light on this? I am using AEM 6.2 with CFP 18.

Accepted Solutions (1)


Gaurav-Behl

MVP

06-02-2019

Check this page: EncodingAndEscaping - Jackrabbit Wiki

Use Text.escapeIllegalXpathSearchChars(searchTerm).replaceAll("'", "''") when building the query, as in the example from that page:

String q =
  "/jcr:root/foo/element(*, foo)" +
  "[jcr:contains(@title, '" + Text.escapeIllegalXpathSearchChars(searchTerm).replaceAll("'", "''") + "')]" +
  "[@itemID = '" + itemID.replaceAll("'", "''") + "']";
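Applied to the query from the question, the quote doubling could look roughly like the sketch below. It is only an illustration: the class and method names (XPathLiterals, escapeQuotes, likePattern, buildQuery) are made up for this example, the path and property names are copied from the question, and the call to Text.escapeIllegalXpathSearchChars is left out so the snippet compiles without Jackrabbit on the classpath.

```java
import java.util.Locale;

// Hypothetical helper, not part of Jackrabbit or AEM.
final class XPathLiterals {
    private XPathLiterals() {}

    /** Doubles each single quote so the value can sit inside a '...' XPath string literal. */
    static String escapeQuotes(String value) {
        return value.replace("'", "''");
    }

    /** Builds the jcr:like pattern from the question: lower-cased, spaces turned into % wildcards. */
    static String likePattern(String searchTerm) {
        String lowered = searchTerm.toLowerCase(Locale.ROOT);
        return "%" + escapeQuotes(lowered).replace(' ', '%') + "%";
    }

    /** Assembles the XPath statement; path and property names are taken from the question. */
    static String buildQuery(String searchTerm) {
        // The pattern is passed as a format argument, so its % characters need no extra escaping.
        return String.format(
            "/jcr:root/content/sample/en/home//*[@sling:resourceType='sample/components/page/person-bio'"
                + " and (jcr:like(fn:lower-case(title), '%1$s') or jcr:like(fn:lower-case(bioinfo), '%1$s'))]",
            likePattern(searchTerm));
    }
}
```

With this sketch, XPathLiterals.buildQuery("ma'a salama") produces a statement whose literals contain ma''a rather than ma'a, and that string can then be passed to findResources(query, "xpath") as before.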

Answers (5)


Gaurav-Behl

MVP

07-02-2019

I believe the escaping simply converts ' to \' (adds a backslash), and we then need the replace to double the quote because it is an XPath query. I can check the API and validate this later, or maybe do a simple test and print the result in the logs.

michaelh2862615

06-02-2019

That resolved the issue, thanks. I found the function to escape illegal characters on that exact page, but I was not sure why I would need to replace characters if they were properly escaped in the first place. The document doesn't explain why the replaceAll is there either. A little odd. Now, however, the query contains:

jcr:like(fn:lower-case(@jcr:title), '%ma''a%salama%'

Will this be able to match ma'a salama using two single quotes between the a's?
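For context on that question: in XPath string literals, a doubled quote inside a single-quoted literal is the escape for one quote character, so the parser should read '%ma''a%salama%' as the pattern %ma'a%salama%. The sketch below only mimics that rule in plain Java to show the idea; it is not the parser AEM uses, and the class and method names are invented for the example.

```java
// Hypothetical demo class; mimics how a single-quoted XPath literal is decoded.
final class XPathLiteralDemo {
    private XPathLiteralDemo() {}

    /** Returns the string value of a '...' literal, reading each '' pair as one quote. */
    static String decode(String literal) {
        int n = literal.length();
        if (n < 2 || literal.charAt(0) != '\'' || literal.charAt(n - 1) != '\'') {
            throw new IllegalArgumentException("not a single-quoted literal: " + literal);
        }
        String body = literal.substring(1, n - 1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c != '\'') {
                out.append(c);
            } else if (i + 1 < body.length() && body.charAt(i + 1) == '\'') {
                out.append('\''); // '' stands for one literal quote
                i++;              // skip the second quote of the pair
            } else {
                // A lone quote would end the literal early, which is what broke the original query.
                throw new IllegalArgumentException("unescaped quote at index " + i);
            }
        }
        return out.toString();
    }
}
```

Here decode("'%ma''a%salama%'") gives %ma'a%salama%, while the undoubled form '%ma'a%salama%' is rejected, which mirrors the parse error from the original query.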

michaelh2862615

05-02-2019

I would agree but I do escape it, using org.apache.jackrabbit.util.Text.escapeIllegalXpathSearchChars.  I have also tried Text.escapeIllegalJcrChars.  Is there no existing function to escape the query automatically or do I have to write something?

smacdonald2008

05-02-2019

Why use XPath? You should look at the QueryBuilder API.

In fact, a deep dive into using the QueryBuilder API is the topic of this month's Ask the AEM Community Experts session, hosted by our AEM super user bsloki.

Gaurav-Behl

MVP

05-02-2019

To me, the "expected: )" error is caused by the single quote in the input string.

Escape it by doubling it: turn each single quote into two single quotes, then try again.