AEM Page Version Offloading

Level 2

Hi,

In my current project, AEM page version history is retained and never deleted, for audit and records-management purposes. However, this has caused the repository size to grow steadily.

I am exploring whether there is a way to offload only the AEM page versions (not a whole-repository backup) out of AEM and, when needed, bring those versions back into the repository.

For example: my current site has the last 4 years of page versions. I want to offload page versions up to 2017 out of AEM to keep the repository lean, and if I need to compare against or restore an old version, bring the offloaded versions back into AEM.

I know it's a one-off scenario, but I wanted to check on this forum whether I could get some feedback or alternative solutions.

Thanks,

Pankaj

5 Replies

Employee

One way to deal with this scenario would be to create a fresh instance running the same AEM version. Then use the crx2oak tool to migrate the page content using the --include-paths=/content/geometrixx argument and the --copy-versions=true flag, which would give you a backup instance with all pages and versions.

You can then run a weekly/monthly job that merges the pages and versions with the crx2oak tool to keep the backup instance up to date.
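
As a rough sketch of what such a scheduled run could look like: the script below only assembles the crx2oak command from the two options mentioned above and prints it unless RUN=1 is set. The jar location and both repository paths are hypothetical placeholders, and it assumes the destination instance is stopped while the tool runs.

```shell
#!/bin/sh
# Sketch of a scheduled crx2oak copy. All paths are hypothetical placeholders;
# override them through the environment to match your own installation.
CRX2OAK_JAR="${CRX2OAK_JAR:-/opt/tools/crx2oak-all-in-one.jar}"
SRC_REPO="${SRC_REPO:-/opt/aem/source/crx-quickstart/repository}"
DST_REPO="${DST_REPO:-/opt/aem/backup/crx-quickstart/repository}"
CONTENT_PATH="${CONTENT_PATH:-/content/geometrixx}"

# Print the command line that would be executed (one line, space separated).
build_cmd() {
  printf 'java -jar %s %s %s --include-paths=%s --copy-versions=true\n' \
    "$CRX2OAK_JAR" "$SRC_REPO" "$DST_REPO" "$CONTENT_PATH"
}

# Dry run by default: only print the command. Set RUN=1 to execute it.
if [ "${RUN:-0}" = "1" ]; then
  java -jar "$CRX2OAK_JAR" "$SRC_REPO" "$DST_REPO" \
    "--include-paths=$CONTENT_PATH" --copy-versions=true
else
  build_cmd
fi
```

Saved as a script, this could be called from cron or any other scheduler on the weekly/monthly cadence mentioned above. Running it once in dry-run mode first makes it easy to check the paths before anything touches the backup repository.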

Level 2

Thanks, Kunwar, for the details.

I did a simple setup to test the approach, but after copying the content tree AEM throws an error on startup. Details follow:

  1. Set up a fresh AEM 6.3 source instance.
    1. Go to the /content/we-retail/us/en/men.html page and create a couple of versions.
  2. Set up a fresh AEM 6.3 destination instance.
  3. Copy the /content/we-retail subtree from source to destination using crx2oak, with the following command:
    1. java -jar crx2oak-1.8.4-all-in-one.jar <source>/crx-quickstart/repository <destination>/crx-quickstart/repository --include-paths=/content/we-retail --copy-versions=true
    2. Wait for the command to complete.
  4. Start the destination AEM instance.
    1. The AEM instance responds with a 503 error.
    2. The error log shows the following errors repeatedly:

10.05.2018 11:48:51.164 *ERROR* [qtp32225747-166] org.apache.sling.engine.impl.SlingHttpContext handleSecurity: AuthenticationSupport service missing. Cannot authenticate request.

10.05.2018 11:48:51.165 *ERROR* [qtp32225747-166] org.apache.sling.engine.impl.SlingHttpContext handleSecurity: Possible reason is missing Repository service. Check AuthenticationSupport dependencies.

10.05.2018 11:48:53.165 *ERROR* [qtp32225747-167] org.apache.sling.engine.impl.SlingHttpContext handleSecurity: AuthenticationSupport service missing. Cannot authenticate request.

10.05.2018 11:48:53.165 *ERROR* [qtp32225747-167] org.apache.sling.engine.impl.SlingHttpContext handleSecurity: Possible reason is missing Repository service. Check AuthenticationSupport dependencies.

Am I missing something in the steps above? Please advise.

Many Thanks,

Pankaj

Employee

First, you should use a 1.6.x version of the crx2oak tool.

Level 2

Thank you, Kunwar, for pointing out the wrong version.

After switching to a 1.6.x version of crx2oak, I was able to copy the content tree from source to destination.

I then tried the exact scenario I am after, where old versions are purged in the source but all versions are retained in the backup:

  1. In the source AEM instance, create 5 versions of a page under /content/we-retail.
  2. Copy the /content/we-retail subtree from source to destination using crx2oak. Source and destination now both have 5 versions of the page.

          java -jar crx2oak-1.6.8-all-in-one.jar <source>/crx-quickstart/repository <destination>/crx-quickstart/repository --include-paths=/content/we-retail --copy-versions=true

      [Screenshot: 1484909_pastedImage_1.png]

  3. Purge the first 3 versions in the source (to simulate old versions being purged in the active instance while retained in the backup instance).
  4. Create 2 new versions in the source after the purge, then check the source version history of the page.

      [Screenshot: 1484911_pastedImage_6.png]

  5. Copy the /content/we-retail subtree from source to destination again using the same crx2oak command.
  6. Start the destination instance and check the page versions. Versions 1 to 3 are no longer present in the destination, so it seems crx2oak does not retain page versions that already exist there.

      [Screenshot: 1484919_pastedImage_9.png]

So the objective of incrementally adding new page versions to the backup instance while retaining the old ones is not being achieved.

Is there a different command or parameter for adding versions incrementally?

Note: I tried the --merge-paths=/content/we-retail parameter as well, but to no effect.

Does merge-paths only work at the content level, for example merging new pages, and not for merging different versions of the same content page?

Employee

I don't think there is a flag available for migrating versions incrementally into the version storage. But you can add a custom hook to the tool to make the copy incremental. See (1) for how to add a custom hook.

(1) Jackrabbit Oak – Repository migration
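
To make the custom-hook idea a little more concrete, here is a minimal scaffold, offered under assumptions rather than as a tested recipe. It assumes the tool discovers hooks through Java's standard ServiceLoader mechanism, i.e. a descriptor file named after the Oak CommitHook interface; please confirm that against the linked migration page. The class name com.example.migration.RetainVersionsHook and the output directory are hypothetical, and the hook implementation itself is not shown.

```shell
#!/bin/sh
# Sketch: scaffold a ServiceLoader descriptor for a custom Oak CommitHook.
# Assumption: the migration tool picks up hooks listed in this descriptor.
# HOOK_CLASS is a hypothetical class name; you would still have to write it.
HOOK_CLASS="${HOOK_CLASS:-com.example.migration.RetainVersionsHook}"
OUT_DIR="${OUT_DIR:-./custom-hook}"
SERVICE_FILE="$OUT_DIR/META-INF/services/org.apache.jackrabbit.oak.spi.commit.CommitHook"

# Create the directory tree and write one fully qualified class name per line.
write_descriptor() {
  mkdir -p "$(dirname "$SERVICE_FILE")" || return 1
  printf '%s\n' "$HOOK_CLASS" > "$SERVICE_FILE"
}

write_descriptor && echo "wrote $SERVICE_FILE"
```

The compiled hook class and this META-INF directory would then be packaged into one jar and put on the tool's classpath. How exactly that jar is passed to crx2oak is something to check in the migration documentation, since it may differ between tool versions.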