Access or export old versions of sites content to file | Community
May 12, 2021
Solved


Hello AEM team,

 

I have a collection of policies (thousands) published through AEM. Each of these policies has as many as 10 older versions of the current one.

I need to extract the text content of the older versions for litigation. Do you know of a way to export the content of older versions as text, HTML, or PDF files?

2 replies

ibishika
Level 4
May 13, 2021

It depends on what you want to do with the extracted content as HTML, but you can get the content as XML files, either by building a package in Package Manager or by pulling the content into your project's content folder with an IDE plugin. You can then convert the extracted XML files to HTML.
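
To illustrate the XML-to-HTML step, here is a minimal Python sketch. It assumes the extracted file is a `.content.xml` in which the page title is stored in a `jcr:title` attribute and the body markup in an attribute named `text`. Those property names, the `SAMPLE` document, and the `content_xml_to_html` helper are all hypothetical, so adjust them to match your own components.

```python
import html
import xml.etree.ElementTree as ET

# Namespace URI behind the jcr: prefix in exported content XML.
JCR_NS = "http://www.jcp.org/jcr/1.0"

# Hypothetical sample of an extracted .content.xml (structure is an assumption).
SAMPLE = """<jcr:root xmlns:jcr="http://www.jcp.org/jcr/1.0"
    xmlns:cq="http://www.day.com/jcr/cq/1.0" jcr:primaryType="cq:Page">
  <jcr:content jcr:primaryType="cq:PageContent" jcr:title="Policy A">
    <text jcr:primaryType="nt:unstructured" text="&lt;p&gt;Body&lt;/p&gt;"/>
  </jcr:content>
</jcr:root>"""


def content_xml_to_html(xml_text: str) -> str:
    """Build a bare HTML page from the title and text attributes in the XML."""
    root = ET.fromstring(xml_text)
    title = ""
    bodies = []
    for element in root.iter():
        # Keep the first jcr:title found anywhere in the tree.
        title = title or element.get("{" + JCR_NS + "}title", "")
        # Assumption: rich text is stored as HTML markup in a "text" attribute.
        text = element.get("text")
        if text:
            bodies.append(text)
    head = "<head><title>" + html.escape(title) + "</title></head>"
    body = "<body>" + "".join(bodies) + "</body>"
    return "<html>" + head + body + "</html>"


if __name__ == "__main__":
    print(content_xml_to_html(SAMPLE))
```

The body markup is written out unescaped because it is assumed to already be HTML. If your properties hold plain text, escape them with `html.escape` as well.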

Vaibhavi_J (Accepted solution)
Level 7
May 13, 2021

Hi @clintg6,

As you have more than 1000 policies, I would not suggest manual extraction.

You can extract the content with a simple custom solution.

  • Fetch the older versions of each node using its path and the jcr:created identifier.
  • Once you have the list of node paths, appending .infinity.json to each path should return the content as JSON, for example /content/nodeName/jcr_content.infinity.json
  • Copy the required content into a text document.
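
The last two steps could be sketched in Python roughly as follows. This is only a sketch under stated assumptions: the host, the credentials, the `TEXT_KEYS` property names, and all function names are hypothetical, and `fetch_node` assumes the instance accepts HTTP basic authentication. It does not cover how to obtain the list of version node paths.

```python
import base64
import json
import urllib.request
from typing import Any, Dict, Iterator, Tuple

# Assumption: the properties that carry the policy text. Adjust to your components.
TEXT_KEYS = ("jcr:title", "jcr:description", "text")


def infinity_json_url(host: str, node_path: str) -> str:
    """Append .infinity.json to a node path, as described in the steps above."""
    return host.rstrip("/") + node_path + ".infinity.json"


def fetch_node(host: str, node_path: str, user: str, password: str) -> Dict[str, Any]:
    """Download a node tree as JSON (assumes basic authentication is accepted)."""
    request = urllib.request.Request(infinity_json_url(host, node_path))
    token = base64.b64encode((user + ":" + password).encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_text_properties(node: Dict[str, Any], path: str = "") -> Iterator[Tuple[str, str]]:
    """Walk the JSON tree and yield (property path, value) for the text properties."""
    for key, value in node.items():
        child_path = path + "/" + key
        if isinstance(value, dict):
            yield from iter_text_properties(value, child_path)
        elif isinstance(value, str) and key in TEXT_KEYS:
            yield child_path, value


def export_text(node: Dict[str, Any]) -> str:
    """Render the collected properties as plain text, one property per line."""
    return "\n".join(p + ": " + v for p, v in iter_text_properties(node))


if __name__ == "__main__":
    # In-memory sample so the sketch runs without an AEM instance.
    sample = {
        "jcr:primaryType": "cq:PageContent",
        "jcr:title": "Policy A",
        "root": {"text": {"text": "<p>Body</p>", "sling:resourceType": "example"}},
    }
    print(export_text(sample))
```

For a real run you would loop over your node paths, call `fetch_node` for each one, and write the result of `export_text` to a file per version.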
clintg6 (Author)
May 13, 2021
Thanks for the help, Vaibhavi. What if I also wanted to save each old version as an HTML or PDF file? How would I do that in addition to the text extraction?