Level 4

Question

GraphQL query shows up in Query Recorder logs with large scans, but the query should be performant

March 11, 2026
4 replies
43 views

We have some GraphQL persisted queries that show up in the AEM Query Recorder logs as having large scans to return a single result. Here’s an example of what we see in the logs.

[72.135.49.53 [1773258109390] POST /graphql/execute.json/mycompany/getProductDetails HTTP/1.1] org.apache.jackrabbit.oak.query.stats.QueryRecorder count:	2400	query:	SELECT main.* FROM [dam:IndexedFragmentData] AS main WHERE ISDESCENDANTNODE(main, 'x') AND main.[@string@model] = 'x' AND (name(main) = 'x') AND ((ISDESCENDANTNODE(main, 'x')) AND (main.[string@serviceCode] = 'x') AND (main.[jcr:primaryType] IS NOT NULL) AND (main.[jcr:primaryType] IS NOT NULL) AND (main.[jcr:primaryType] IS NOT NULL)) ORDER BY main.[jcr:path] OPTION (INDEX TAG[contentFragments], TRAVERSAL FAIL)

What’s jumping out to me is the “QueryRecorder count: 2400” when there is only 1 record returned. And when I run the same query locally and look at its performance in the Query Performance tool it shows as scanning only 1 row. So I’m wondering how this could happen. Any help would be appreciated as we don’t have access to the Query Performance tools on Publish in Cloud environments. I have opened a ticket with Adobe as well, but I’m curious if anyone has seen this.

AmitVishwakarma

Community Advisor

Hi @Preston-3

You're reading that QueryRecorder line correctly, but the count: 2400 part is easy to misunderstand:

[... ] org.apache.jackrabbit.oak.query.stats.QueryRecorder 
  count: 2400 
  query: SELECT main.* FROM [dam:IndexedFragmentData] ...

That count is the cumulative execution count of that exact SQL2 statement on that pod, not "rows scanned for this call".

So in your case:

The GraphQL persisted query returns 1 record.
Locally, the Query Performance tool shows scan = 1 row for a single run (using the proper CF index).
On Cloud Publish, QueryRecorder count: 2400 just means that this same persisted query has been executed ~2400 times since the pod (or QueryRecorder stats) started.

It does not mean "this one request scanned 2400 nodes".
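
To make that reading concrete, here is a minimal Python sketch that pulls the count and the statement out of a QueryRecorder line. The regular expression is an assumption inferred only from the log excerpt quoted in this thread (whitespace-separated `count:` and `query:` fields); it is not an official log format.

```python
import re

# Pattern inferred from the log excerpt in this thread; treat it as an assumption.
RECORDER_LINE = re.compile(r"QueryRecorder\s+count:\s*(\d+)\s+query:\s*(.+)$")

def parse_recorder_line(line):
    """Return (count, query) for a QueryRecorder log line, or None if it does not match."""
    match = RECORDER_LINE.search(line.strip())
    if match is None:
        return None
    return int(match.group(1)), match.group(2)

sample = (
    "[... ] org.apache.jackrabbit.oak.query.stats.QueryRecorder "
    "count:\t2400\tquery:\tSELECT main.* FROM [dam:IndexedFragmentData] AS main"
)
print(parse_recorder_line(sample))
```

Collecting these pairs over a day of logs and watching the count climb for the same statement would support the "execution counter" reading; a count that jumps around between calls would not.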

How to check if there's a real performance problem

Ignore count for performance diagnosis and instead look for:

Traversal / read-limit warnings in logs, e.g.
- Index-Traversed XXXXX nodes…
- The query read or traversed more than 100000 nodes…
Whether the query is using an index (your example has OPTION (INDEX TAG[contentFragments], TRAVERSAL FAIL), which is ideal).
Query Performance on a representative, large-ish environment: check the plan and the nodes scanned/read, not the count metric.

If there are no traversal/read-limit warnings and the plan uses the CF index with low node scans, the query is healthy; a high count just reflects that it's popular.
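
A rough way to run the first check against downloaded logs is sketched below in Python. The marker substrings are taken from the two warnings quoted above and are assumptions; the real wording may differ between Oak versions, and the sample log lines are made up for illustration.

```python
# Substrings taken from the warnings quoted above; real log wording may differ.
WARNING_MARKERS = (
    "read or traversed more than",
    "Traversed",
)

def find_traversal_warnings(log_lines):
    """Return the lines that contain any of the assumed warning markers."""
    return [
        line
        for line in log_lines
        if any(marker in line for marker in WARNING_MARKERS)
    ]

# Hypothetical log lines, written by hand for this example.
sample_log = [
    "QueryRecorder count: 2400 query: SELECT main.* ... OPTION (TRAVERSAL FAIL)",
    "WARN The query read or traversed more than 100000 nodes",
    "INFO request completed",
]
print(find_traversal_warnings(sample_log))
```

An empty result for the time window in question is a hint, not proof, that the query is not hitting traversal or read limits.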

Amit Vishwakarma - Adobe Commerce Champion 2025 | 16x Adobe certified | 4x Adobe SME

Preston-3 (Author)

Level 4

As I said in the other comment, the other commenter seems to be saying exactly the opposite. Do you have documentation for the idea that the nodes scanned is cumulative for queries like this?

I haven’t noticed any related traversal warnings in the logs, nor errors. We can’t check the query performance on Publish as such, because we obviously don’t have access to query tools on Publish.

VishalKa5

Level 6

Hi @Preston-3 ,

In Adobe Experience Manager (AEM) Cloud Service, this can happen due to how Oak records query scans.

Key Points:

QueryRecorder count shows the number of nodes scanned, not the number of results returned.
Local vs Cloud difference may occur because indexes or content size differ.
GraphQL queries use the contentFragments index, and if it is not fully optimized, Oak may scan more nodes.
Publish environments usually have more data, which can increase scan counts.
Since Query Performance tools are not available in Cloud Publish, raising a ticket with Adobe is the correct step.

Conclusion: A large scan count in logs does not always mean poor performance; it mainly indicates how many nodes Oak checked during the query.

Thanks

Preston-3 (Author)

Level 4

@AmitVishwakarma - Can you weigh in here? Intuitively it makes sense that the QueryRecorder count is referring to the number of nodes scanned for *this* result, but you’re saying the opposite of what @VishalKa5 is saying here.

@VishalKa5 - I think Publish environments are actually only going to have LESS data by their nature, right? Unless you’re referring to content that was removed from Author without first being unpublished. Author should always have more.

Hopefully the ticket with Adobe helps.

VishalKa5

Level 6

Hi @Preston-3,

Yes, you’re right — typically Author has more content than Publish, since unpublished or draft content exists only on Author. Thanks for pointing that out.

My earlier point was mainly that the QueryRecorder count represents the number of nodes Oak scanned during query execution, not the number of results returned. Even if the final result is one record, the query engine may still scan multiple indexed entries depending on the index structure and filters.

Differences between local and cloud environments can also happen due to index state, reindexing, or dataset differences, which might explain why the scan count looks higher in the logs.

Agree that the Adobe support ticket should give more clarity, especially since we cannot access Query Performance tools on Cloud Publish.

Thanks,
Vishal

kautuk_sahni

Adobe Employee

@Preston-3 Following up to see if this has been resolved. If you’ve found the answer—whether through the suggestions here or through your own troubleshooting—feel free to post the solution so others can benefit. If any reply helped along the way, marking it as accepted ensures the most useful information is easy to find. Thanks for helping close the loop!

Kautuk Sahni

PGURUKRISHNA

Level 5

Hey @Preston-3

This is a common source of confusion with AEM's Query Recorder vs. the Query Performance tool: they measure different things.

Here's what's likely happening:

Query Recorder "count" ≠ rows scanned per execution

The `QueryRecorder count: 2400` is not the number of nodes scanned in a single query execution. The Query Recorder aggregates over time: it tracks how many times that query (or its pattern) has been executed across a sampling window. So `count: 2400` likely means that query pattern was executed ~2,400 times in the recording interval, not that it scanned 2,400 nodes in one call.
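
A toy Python model of such a per-pattern counter is sketched below. It is not Oak's code: the literal-masking rule and the logging interval (`LOG_EVERY`) are assumptions chosen to resemble the log line in the question, where every literal appears as 'x' and the count is a round 2400.

```python
import re
from collections import Counter

LOG_EVERY = 100  # assumed logging interval, for illustration only

def simplify(query):
    """Replace quoted literals with 'x' so calls with different values share one pattern."""
    return re.sub(r"'[^']*'", "'x'", query)

class PatternCounter:
    def __init__(self):
        self.counts = Counter()
        self.logged = []

    def record(self, query):
        pattern = simplify(query)
        self.counts[pattern] += 1
        count = self.counts[pattern]
        # Emit a line on the first execution and then on every LOG_EVERY-th one.
        if count == 1 or count % LOG_EVERY == 0:
            self.logged.append(f"count:\t{count}\tquery:\t{pattern}")
        return count

counter = PatternCounter()
for i in range(2400):
    counter.record(
        "SELECT * FROM [dam:IndexedFragmentData] AS main "
        f"WHERE main.[string@serviceCode] = 'code-{i}'"
    )
print(counter.logged[-1])
```

In this model each call touches the index once, yet the last logged line still carries a count of 2400, which is the behavior described above.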

This is why when you run it locally in the Query Performance tool (which measures a single execution), you see only 1 row scanned — that's the actual per-execution cost, and it's fine.

Why it shows up in the logs at all

The Query Recorder logs queries that exceed a configurable threshold of total cumulative cost (executions × cost). A query that scans 1 node but runs 2,400 times in a short window will still appear. On a Publish tier under real traffic, popular persisted queries get called frequently, which inflates the count.

Things to verify

Index is healthy on Publish: Your query uses `OPTION (INDEX TAG[contentFragments], TRAVERSAL FAIL)`. The `TRAVERSAL FAIL` means it will fail rather than fall back to traversal, which is good. If it's returning results, the index is being used. But index definitions can differ between Author and Publish, or the async index might be lagging. Check that your `oak:index` definitions are deployed consistently.
The `ISDESCENDANTNODE` appears twice: The generated SQL2 has a redundant descendant node constraint. This is normal for GraphQL-generated queries in AEM (the framework adds both the path filter and the content fragment path restriction). It shouldn't cause performance issues if the index covers it, but it's worth noting.
Dispatcher/CDN caching: If this query is being called 2,400 times in a short window, you may want to ensure your persisted query responses are being cached at the Dispatcher or CDN layer. Persisted queries support GET requests, which are cacheable, so this would dramatically reduce the load on Publish.
`dam:IndexedFragmentData`: This node type is used by the Content Fragment index. Make sure the `/oak:index/damAssetLucene` (or your custom CF index) has the `serviceCode` property indexed. If it's not in the index definition, Oak would need to do property-level filtering post-index, which could increase scan cost.
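
For the index-coverage check in the last item, a minimal Python sketch follows. It assumes the index definition has been exported to a nested dictionary with the usual `indexRules` / `properties` layout of an Oak Lucene index; the sample definition is hand-written and hypothetical, and a real Content Fragment index may use regular-expression property definitions that a plain name match like this would miss.

```python
def missing_indexed_properties(index_definition, node_type, required):
    """Return the names in `required` with no property entry under the node type's rule."""
    rule = index_definition.get("indexRules", {}).get(node_type, {})
    properties = rule.get("properties", {})
    indexed = {
        entry.get("name")
        for entry in properties.values()
        if isinstance(entry, dict)  # skip scalar metadata such as jcr:primaryType
    }
    return [name for name in required if name not in indexed]

# Hypothetical, hand-written definition; not copied from any AEM instance.
sample_definition = {
    "indexRules": {
        "dam:IndexedFragmentData": {
            "properties": {
                "jcr:primaryType": "nt:unstructured",
                "model": {"name": "@string@model", "propertyIndex": True},
            }
        }
    }
}

print(missing_indexed_properties(
    sample_definition,
    "dam:IndexedFragmentData",
    ["@string@model", "string@serviceCode"],
))
```

Running the same check against the definitions deployed to Author and to Publish is one way to spot the kind of drift mentioned in the first item.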

Recommendation

Your query itself is likely fine (1 node scanned per execution confirms this). The Query Recorder entry is a volume indicator, not a performance red flag per se. Focus on:

Caching persisted query responses at Dispatcher/CDN
Confirming the index covers all filtered properties (`@string@model`, `string@serviceCode`)
Monitoring via Cloud Manager's log tailing or New Relic for actual slow query times, not just Query Recorder counts
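
On the caching point: the request in the logged line is a POST, which a Dispatcher or CDN will normally not cache. A minimal Python sketch of building the GET form of a persisted query URL is shown below. The host, the variable name `serviceCode`, and its value are hypothetical; the query path is the one from the log line in the question. Encoding the whole variable section, including `;` and `=`, is an assumption here, and whether the separators need encoding can depend on the client and Dispatcher setup.

```python
from urllib.parse import quote

def persisted_query_url(host, query_path, variables):
    """Build a GET URL of the form <host>/graphql/execute.json/<query_path>;name=value;..."""
    suffix = "".join(f";{name}={value}" for name, value in variables.items())
    # Encode the whole variable section, including ';' and '='.
    return f"{host}/graphql/execute.json/{query_path}{quote(suffix, safe='')}"

print(persisted_query_url(
    "https://publish.example.com",   # hypothetical host
    "mycompany/getProductDetails",   # path taken from the log line in the question
    {"serviceCode": "ABC 1"},        # hypothetical variable name and value
))
```

Because the variables become part of the URL, each distinct value gets its own cache entry, so the cache hit rate depends on how many distinct values are requested.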

The Adobe support ticket is the right call for getting Publish-side index and query performance data you can't access directly in Cloud environments.

Pagidala GuruKrishna
