Expand my Community achievements bar.

Guidelines for the Responsible Use of Generative AI in the Experience Cloud Community.

What's the proper way to query all the users of an AEM installation.

Avatar

Level 2

Hello,

Do you guys know what's the proper way (if exists) to query all the users in the system (under /home/users)?

We have a site for one of our clients in which in one component they need to select a user from the system, it can be any user. Currently we use the following query to retrieve the users based on a search term ("%s").

SELECT * FROM [nt:unstructured] AS p WHERE (ISDESCENDANTNODE([/home/users]) AND (CONTAINS(p.email, \"*%s*\") OR CONTAINS(p.givenName, \"*%s*\") OR CONTAINS(p.familyName, \"*%s*\") OR CONTAINS(p.preferredName, \"*%s*\")) AND [email] IS NOT NULL AND [id] IS NOT NULL)

There are a lot of users in the system, we use SAML so the profiles have a lot of custom properties. Using this query and an admin session is pretty much what we do to retrieve the users.

Currently we're experiencing some performance issues and the query is starting to take more and more time.

Is there like an out of the box feature for AEM to do this?

Any help is greatly appreciated.

Thank you so much.

4 Replies

Avatar

Level 10

Your approach is valid. Using JCR SQL2. We had written tools in past and you are correct. Depending on the number of user nodes, it May take time to. process the result set.

Avatar

Level 2

Hi,

Thanks for your response.

Is there another way to optimize this (besides adding the required OAK indexes) ?

Avatar

Level 10

Per my knowledge, SQL2 query optimization follows more or less similar best practices as SQL. You may search for 'SQL tuning best practices' and implement the same.

This query is already using "CONTAINS" and specific node (nt:unstructured) to be searched which is good.

Couple of things may be revisited (and tested) accordingly:

-Using select fields rather than select *

-avoiding wild cards in the front

-use UNION than ORs

-use inner joins

https://www.sisense.com/blog/8-ways-fine-tune-sql-queries-production-databases/

If you use OOB Query Performance tool (http://localhost:4502/libs/granite/operations/content/diagnosistools/queryPerformance.html), it would provide you with optimized query. In your case, the optimized query follows same SQL standards.

The cost can be worked upon using indexes.

Attempting optimisation

Alternative query available:

select [p].[jcr:primaryType] as [p.jcr:primaryType] from [nt:unstructured] as [p] where (contains([p].[familyName], '*%s*')) and (isdescendantnode([p], [/home/users])) and ([p].[email] is not null) and ([p].[id] is not null) union

select [p].[jcr:primaryType] as [p.jcr:primaryType] from [nt:unstructured] as [p] where (contains([p].[email], '*%s*')) and (isdescendantnode([p], [/home/users])) and ([p].[email] is not null) and ([p].[id] is not null) union

select [p].[jcr:primaryType] as [p.jcr:primaryType] from [nt:unstructured] as [p] where (contains([p].[preferredName], '*%s*')) and (isdescendantnode([p], [/home/users])) and ([p].[email] is not null) and ([p].[id] is not null) union

select [p].[jcr:primaryType] as [p.jcr:primaryType] from [nt:unstructured] as [p] where (contains([p].[givenName], '*%s*')) and (isdescendantnode([p], [/home/users])) and ([p].[email] is not null) and ([p].[id] is not null)

1707760_pastedImage_11.png

Avatar

Employee Advisor

If you need to have the complete list of users quite often:

* use a query and fill the result into an in-memory cache

* when something below /home/users changes, either invalidate the cache completely or just re-read the change user.

Jörg