Expand my Community achievements bar.

SOLVED

Extremely poor performance of query to get top 25 rows

Avatar

Level 5

We are using paging to avoid loading all records at same time. However, when we execute the query to get top 25 rows, it still returns all the rows (6000) and shows 25. But then it makes no sense since it's taking around 44 seconds for first query. How to optimize it?

sample query is below. First query execution time is 44 sec. Subsequent is around 5. But still seems v high for such a query. Anything we need to change here to speed it up?

path=/content/imf/publishedDocuments
type=nt:unstructured
orderby=@documentDate
orderby.sort=desc
p.offset=0
p.limit=25

1 Accepted Solution

Avatar

Correct answer by
Employee Advisor

Hi,

The problem with your query (no matter if you use QueryBuilder or XPATH/SQL2 directly) is the fact, that all the results are fetched from the repository, then filtered and ordered, and the first 25 rows are returned. This takes that much time.

The Jackrabbit/OAK implementation of the query uses a lazy loading approach. That means, it fetches a number of results from the repository and then returns an iterator. If you want to read more results than already loaded, the iterator goes back to the repo and fetches the next results and so on. But this only works if you don't request and ordering. Because for ordering the query engine needs to fetch all results from the repo and then do the ordering based on the complete result set.

So in AEM 5.x it's really hard to speedup this query, the only chance I see is to reduce the total amount of resulting nodes, which are required to be ordered. in AEM 6.x you can create a special index containing path constraints and lots of other features to speed up search.

kind regards,
Jörg

View solution in original post

6 Replies

Avatar

Level 3
If you haven't already, Try using sql instead of xpath query.

Avatar

Employee

If this is AEM 5.x, then the path will be checked after the rest of the predicates have been evaluated, as you have a generic search it may be matching a lot of content. To optimise the query, it would be best to add a property/tag value to the search. 

Regards,

Opkar

Avatar

Correct answer by
Employee Advisor

Hi,

The problem with your query (no matter if you use QueryBuilder or XPATH/SQL2 directly) is the fact, that all the results are fetched from the repository, then filtered and ordered, and the first 25 rows are returned. This takes that much time.

The Jackrabbit/OAK implementation of the query uses a lazy loading approach. That means, it fetches a number of results from the repository and then returns an iterator. If you want to read more results than already loaded, the iterator goes back to the repo and fetches the next results and so on. But this only works if you don't request and ordering. Because for ordering the query engine needs to fetch all results from the repo and then do the ordering based on the complete result set.

So in AEM 5.x it's really hard to speedup this query, the only chance I see is to reduce the total amount of resulting nodes, which are required to be ordered. in AEM 6.x you can create a special index containing path constraints and lots of other features to speed up search.

kind regards,
Jörg

Avatar

Level 5

You mean JCR-SQL2?

Can you translate a sample and let me try to run it

Avatar

Level 10

What version of AEM are you using. If you are using AEM 6 - you need to look into OAK indexing. 

Avatar

Level 10

You can also use 'query' under tools in crx/de to test your sql2.

Also as @Scott mentioned, AEM version matters as in AEM 6.0 indexes would not be created and would will have to create the indexes. In AEM 6.x, you can also use Query Manager to evaluate your query.