fetching/crawling data from AEM
Hi team,
We have a requirement to fetch/crawl all the data from AEM and ingest it into our project.
What is the best approach for this? Could you also share some useful links/videos?
Thank you,
Sriram
Hi @sriram_1, you did not mention what you are trying to achieve with this data. If the goal is search, you should probably go for a third-party search engine such as Solr. But for the sake of an answer:
There are two ways you can do it.
1. Iterate over pages/assets/users and prepare the result.
2. Use a query, either Query Builder or JCR-SQL2. Write a service that executes the query and returns the result.
I am sharing examples of both. But first you have to get a ResourceResolver; the samples here obtain one through a ResolverUtil.newResolver(resourceResolverFactory) helper.
1. Example code for iteration. Make sure your ResourceResolver has the permissions needed to access the required data/content/pages/users.
Getting pages
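// getPage returns null if the path is not a page or is not readable by this resolver
Page page = resourceResolver.adaptTo(PageManager.class).getPage("/content");
// second argument true = deep listing (all descendants, not only direct children)
Iterator<Page> childPages = page.listChildren(null, true);
while (childPages.hasNext()) {
    Page childPage = childPages.next();
    // process childPage here, e.g. read childPage.getPath()
}
Getting users and groups and printing them in the logs. Use as per your need.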
ResourceResolver resourceResolver = ResolverUtil.newResolver(resourceResolverFactory);
Session session = resourceResolver.adaptTo(Session.class);
UserManager userManager = ((JackrabbitSession) session).getUserManager();
Iterator<Authorizable> userIterator = userManager.findAuthorizables("jcr:primaryType", "rep:User");
LOG.info("\n ----------GETTING USERS-------------");
while (userIterator.hasNext()) {
Authorizable user = userIterator.next();
LOG.info("\n User : {}", user.getPath());
}
Iterator<Authorizable> systemUserIterator = userManager.findAuthorizables("jcr:primaryType", "rep:SystemUser");
LOG.info("\n ----------GETTING System USERS-------------");
while (systemUserIterator.hasNext()) {
Authorizable serviceUser = systemUserIterator.next();
LOG.info("\n Service User : {}", serviceUser.getPath());
}
Iterator<Authorizable> groupIterator = userManager.findAuthorizables("jcr:primaryType", "rep:Group");
LOG.info("\n ----------GETTING Groups-------------");
while (groupIterator.hasNext()) {
Authorizable group = groupIterator.next();
LOG.info("\n Group : {}", group.getPath());
}
2. Sample queries and code implementations.
A Query Builder query to get pages and assets. I am sharing the simplest one; create yours as per your need.
/* --- To get assets --- */
path=/content/dam
type=dam:Asset
p.limit=-1

/* --- To get pages (adjust type as per your content) --- */
path=/content
type=cq:PageContent
p.limit=-1
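If the consuming project sits outside AEM, the same predicates can also be sent over HTTP to the Query Builder JSON servlet at /bin/querybuilder.json. Below is a minimal sketch of building such a URL in plain Java. The host http://localhost:4502 and the class name QueryBuilderUrl are assumptions for illustration only, and authentication is left out.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (not part of AEM): turns Query Builder predicates into a JSON servlet URL.
final class QueryBuilderUrl {

    private QueryBuilderUrl() {
    }

    // Builds <host>/bin/querybuilder.json?<key>=<value>&... in the iteration order of the map.
    static String build(String host, Map<String, String> predicates) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : predicates.entrySet()) {
            query.add(encode(e.getKey()) + "=" + encode(e.getValue()));
        }
        return host + "/bin/querybuilder.json?" + query;
    }

    // URL-encodes a single key or value as UTF-8.
    private static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }
}
```

Calling QueryBuilderUrl.build with a LinkedHashMap holding path=/content/dam, type=dam:Asset and p.limit=-1 produces a URL you can request with any HTTP client, provided the calling user is allowed to read the content.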
How to implement in backend
@Reference
QueryBuilder queryBuilder;
Map<String,String> queryMap=new HashMap<>();
queryMap.put("path","/content/dam/we-retail");
queryMap.put("type","dam:Asset");
queryMap.put("p.limit",Long.toString(-1));
final Session session = resourceResolver.adaptTo(Session.class);
Query query = queryBuilder.createQuery(PredicateGroup.create(queryMap), session);
SearchResult result = query.getResult();
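int perPageResults = result.getHits().size(); // hits returned in this result page
long totalResults = result.getTotalMatches(); // total matches for the query
List<Hit> hits = result.getHits();
for (Hit hit : hits) {
    // hit.getResource() throws RepositoryException; handle or declare it
    Asset asset = hit.getResource().adaptTo(Asset.class);
    if (asset != null) {
        LOG.info("\n Asset {} ", asset.getPath());
    }
}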
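With p.limit=-1 the whole result set comes back in one go, which can be heavy on a large repository. A hedged alternative is to fetch it page by page using the Query Builder p.offset and p.limit parameters. The sketch below only builds the predicate map for one page; the class name PagedPredicates and the page size are assumptions, and you would pass each map to PredicateGroup.create(...) as in the sample above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of AEM): builds Query Builder predicates for one page of results.
final class PagedPredicates {

    private PagedPredicates() {
    }

    // offset = index of the first hit to return, pageSize = maximum number of hits in this page.
    static Map<String, String> page(String path, String type, long offset, long pageSize) {
        if (offset < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("offset must be >= 0 and pageSize must be > 0");
        }
        Map<String, String> predicates = new LinkedHashMap<>();
        predicates.put("path", path);
        predicates.put("type", type);
        predicates.put("p.offset", Long.toString(offset));
        predicates.put("p.limit", Long.toString(pageSize));
        return predicates;
    }
}
```

A caller would start at offset 0 and keep increasing the offset by the page size until a page returns fewer hits than the page size.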
In case you use JCR-SQL2:
String searchPath="/content/we-retail";
// ISDESCENDANTNODE takes the selector name and the path as a quoted string literal
String sql2Query = "SELECT * FROM [cq:PageContent] AS node WHERE ISDESCENDANTNODE(node, '" + searchPath + "') ORDER BY node.[jcr:title]";
ResourceResolver resourceResolver = ResolverUtil.newResolver(resourceResolverFactory);
final Session session = resourceResolver.adaptTo(Session.class);
final javax.jcr.query.Query query = session.getWorkspace().getQueryManager().createQuery(sql2Query,javax.jcr.query.Query.JCR_SQL2);
final QueryResult result = query.execute();
NodeIterator pages=result.getNodes();
JSONArray resultArray = new JSONArray();
while (pages.hasNext()) {
    Node page = pages.nextNode();
    // read the properties you need from 'page' and add them to resultArray
}
These are just code samples. Get the ResourceResolver with proper permissions.
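One detail of the JCR-SQL2 sample is worth spelling out: the path given to ISDESCENDANTNODE has to be a quoted string literal, and a single quote inside a literal is escaped by doubling it. Here is a small sketch of a helper that assembles the statement that way. The class name Sql2Queries and its method names are made up for illustration, and the node type is assumed to come from trusted code, not from user input.

```java
// Hypothetical helper (not part of AEM or JCR): assembles a JCR-SQL2 statement with a quoted path.
final class Sql2Queries {

    private Sql2Queries() {
    }

    // Wraps a value in single quotes, doubling any single quote inside it.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Selects all nodes of the given type below the given path, ordered by jcr:title.
    static String descendantsOf(String nodeType, String path) {
        return "SELECT * FROM [" + nodeType + "] AS node"
                + " WHERE ISDESCENDANTNODE(node, " + quote(path) + ")"
                + " ORDER BY node.[jcr:title]";
    }
}
```

The string returned by Sql2Queries.descendantsOf("cq:PageContent", "/content/we-retail") can be passed to QueryManager.createQuery together with javax.jcr.query.Query.JCR_SQL2, as in the sample above.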