
SOLVED

HTML/JSON extraction (in bulk) option


Hello Community members, 

 

We have a requirement where the end application owner needs only an HTML/JSON extract of all available site pages from AEM (in bulk, filtered by modified date).

We are looking to explore any built-in options or solutions that would allow us to provide this extraction. 

 

Thank you in advance for your time. Your insights and suggestions on how to achieve this would be greatly appreciated.

 

1 Accepted Solution


Correct answer by
Community Advisor

Hi @nj2 
You can achieve this OOTB as follows:

1. Create a servlet that exposes a JSON response containing only the page paths.

2. Retrieve the JSON of each page returned by the first call, one by one. This can be done by requesting each page with the model.json selector.
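
The two steps above can be sketched from the consumer side roughly as follows. This is only a minimal sketch under assumptions: the host, the servlet path `/bin/example/pagelist.json`, and the response shape (a JSON array of objects with `path` and `lastModified` fields) are hypothetical and not part of AEM. Only the `.model.json` request per page comes from the answer itself.

```python
import json
import time
import urllib.request
from datetime import datetime, timezone

# Hypothetical values: replace with your publish host and your own servlet path.
BASE_URL = "https://publish.example.com"
PAGE_LIST_PATH = "/bin/example/pagelist.json"  # assumed custom servlet from step 1


def model_json_url(base_url, page_path):
    """Build the model.json URL for a page path such as /content/site/en/home."""
    return f"{base_url.rstrip('/')}{page_path}.model.json"


def modified_since(pages, since):
    """Return paths of pages whose 'lastModified' value is on or after `since`.

    Assumes each entry looks like {"path": ..., "lastModified": ...} with an
    ISO 8601 timestamp that includes an offset such as +00:00.
    """
    return [
        page["path"]
        for page in pages
        if datetime.fromisoformat(page["lastModified"]) >= since
    ]


def fetch_json(url, timeout=30):
    """GET a URL and parse the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def export_pages(base_url, since, delay_seconds=0.5):
    """Step 1: fetch the page list. Step 2: fetch each page's JSON one by one."""
    pages = fetch_json(base_url.rstrip("/") + PAGE_LIST_PATH)
    for path in modified_since(pages, since):
        yield path, fetch_json(model_json_url(base_url, path))
        time.sleep(delay_seconds)  # short pause so requests do not arrive in a burst
```

Calling `export_pages(BASE_URL, datetime(2024, 1, 1, tzinfo=timezone.utc))` would then yield one `(path, json)` pair per page. Because each page is a separate GET request, the responses stay cacheable at the Dispatcher or CDN.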

 

I would not suggest a single bulk operation here, because of:

1. Very high load on the publish instance.

2. 504 Gateway Timeout errors and unexpected results.

3. The response cannot be cached at the Dispatcher, CDN, or consumer side.

 



Arun Patidar


2 Replies

Community Advisor

Hey, 

If you're looking for an unconventional approach:

  • Explore the Dispatcher cache.
    • Obtain cached HTML files that can be bundled.
    • Note: If you have multiple web servers, you'll need to gather and package HTML from all of them.

Note: For a Cloud environment, raise a ticket and check with Adobe.
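
On a self-managed web server, the bundling step could look roughly like the sketch below. It assumes you have file-system access to the Dispatcher docroot, and the function name and arguments are only illustrative. Note that a cached file's modification time reflects when the page was last cached, not when it was last modified in AEM, so it is only an approximation of the modified-date filter.

```python
import os
import zipfile
from datetime import datetime


def bundle_cached_html(docroot, archive_path, since):
    """Zip every cached .html file under docroot whose mtime is on or after `since`.

    Returns the relative paths that were added, sorted.
    """
    cutoff = since.timestamp()
    added = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _dirs, files in os.walk(docroot):
            for name in files:
                if not name.endswith(".html"):
                    continue  # skip non-HTML cache entries
                full_path = os.path.join(folder, name)
                if os.path.getmtime(full_path) < cutoff:
                    continue  # cached before the cutoff
                member = os.path.relpath(full_path, docroot)
                archive.write(full_path, member)
                added.append(member)
    return sorted(added)
```

With multiple web servers, the same function would need to run on each one, since every server holds its own cache.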

  • Consider using the List core component.
    • Configure the root path and child depth levels.
    • Generate a list of all available page URLs.
    • Invoke the retrieval of HTML for each page.
  • List Component | Adobe Experience Manager
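
For the List component route, the URL collection could be sketched as below. This assumes the rendered page uses the Core Components List markup with links enabled, where each item link carries the `cmp-list__item-link` class. If your markup differs, adjust the class name. The host in the usage note is hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ListLinkParser(HTMLParser):
    """Collect href values from anchors that carry the List item link class."""

    def __init__(self, link_class="cmp-list__item-link"):
        super().__init__()
        self.link_class = link_class
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        href = attributes.get("href")
        if self.link_class in classes and href:
            self.hrefs.append(href)


def page_urls(list_page_html, base_url):
    """Return absolute URLs for every page linked from the rendered List component."""
    parser = ListLinkParser()
    parser.feed(list_page_html)
    return [urljoin(base_url, href) for href in parser.hrefs]
```

Given the HTML of the page hosting the List component, `page_urls(html, "https://publish.example.com")` returns the page URLs, and each one can then be requested individually to retrieve its HTML.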