Content Archival strategy

Nirmal_Jose
Community Advisor

30-09-2020

As different clients have different data archival standards, we need to create archival processes for various clients. In AEM, since content goes into the repository, anything that enters the repo becomes data that is moved across environments. Moreover, we need a process that lets content editors move archived content to another tree or space, keeping the content tree clean and limited to the content that is needed.

 

It would be great if you could support an archival strategy OOTB. The features I am looking for are:

1. Easy to manage from authoring experience

2. Reduced to a small size

3. It should live outside the repository, so that we don't need to move it around as content in the repo.

4. We should be able to search through the metadata

5. We should be able to retrieve the content with some action.

6 Comments

Jörg_Hoh
Employee

30-09-2020

What are your requirements regarding archival? If these are legal requirements, you can use an archival solution that satisfies all of them.

 

I typically recommend this archival strategy for pages: Together with the publishing process you create a PDF/A of the published page, which contains all relevant metadata as well. Store this in your archival solution.

 The archival solution must then provide the necessary features like retrieval, search for metadata, audit trail, etc.
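As a rough illustration of what "retrieval, search for metadata, audit trail" could mean in practice, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `ArchiveRecord` and `ArchiveStore` are hypothetical names, the store is an in-memory stand-in for a real external archival product, and the PDF/A bytes are assumed to be produced elsewhere during publishing. None of this is AEM API.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ArchiveRecord:
    """Hypothetical record kept per archived page (not an AEM type)."""
    page_path: str
    pdf_sha256: str   # digest of the stored PDF, used for integrity checks
    metadata: dict    # page metadata captured at publish time
    archived_at: str  # ISO-8601 timestamp (UTC)


class ArchiveStore:
    """In-memory stand-in for an external archival solution."""

    def __init__(self):
        self._records = []
        self._blobs = {}       # digest -> PDF bytes
        self.audit_trail = []  # (action, detail) tuples

    def archive(self, page_path, pdf_bytes, metadata):
        """Store the PDF and its metadata; intended to run alongside publishing."""
        digest = hashlib.sha256(pdf_bytes).hexdigest()
        record = ArchiveRecord(
            page_path=page_path,
            pdf_sha256=digest,
            metadata=dict(metadata),
            archived_at=datetime.now(timezone.utc).isoformat(),
        )
        self._records.append(record)
        self._blobs[digest] = pdf_bytes
        self.audit_trail.append(("archive", page_path))
        return record

    def search(self, **criteria):
        """Return records whose metadata matches every given key/value pair."""
        self.audit_trail.append(("search", dict(criteria)))
        return [
            r for r in self._records
            if all(r.metadata.get(k) == v for k, v in criteria.items())
        ]

    def retrieve(self, record):
        """Return the stored PDF after verifying it still matches its digest."""
        pdf_bytes = self._blobs[record.pdf_sha256]
        if hashlib.sha256(pdf_bytes).hexdigest() != record.pdf_sha256:
            raise ValueError("integrity check failed for " + record.page_path)
        self.audit_trail.append(("retrieve", record.page_path))
        return pdf_bytes
```

A call such as `store.archive("/content/site/en/offer", pdf_bytes, {"campaign": "autumn"})` followed by `store.search(campaign="autumn")` would return that record. A real archival product would add access control and retention-based deletion on top of this.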

 

Jörg

Nirmal_Jose
Community Advisor

01-10-2020

Thanks @Jörg_Hoh for the feedback. Yes, agreed: generating and storing a PDF will help with archival for legal compliance.

 

One client's requirements are as follows:

1. They have IaC standards and recreate environments from prod AWS backup snapshots.

2. Since the whole repo is in the snapshot, they want it to be as lean as possible.

3. They have tight content workflows where pages are created, activated, deactivated and deleted on a schedule.

4. However, they don't really want to delete a page after deactivation, and they still want it to be searchable.

5. They want to search through the archives and bring a page back if needed.

 

My current solution moves the content into an archival tree, packages it, and installs the package in a special archive environment. But I am looking for a more standardised approach.

Jörg_Hoh
Employee

01-10-2020

Archival is a tough topic, not least because the requirements are so diverse.

 

First of all, from my point of view, archival is not backup. You archive things which are finished and closed, and you need to keep them around, mostly for legal reasons. Just storing files on a fileshare is not an option: everyone can read and write them (so you cannot verify their integrity), you don't have an audit trail, and you cannot delete the records on time (you must not keep them longer than required).

For this requirement I always recommend PDFs and a dedicated archival solution, because AEM is not an archival application.

 

When you want to have "old" content available and searchable, I don't see any option other than retaining it in an archive folder, removing write access for most or all users, and reducing the number of versions kept for it (you don't need them). But be sure you understand the drawbacks:

  • No one can guarantee the integrity of these unpublished pages.
  • They create some overhead, as they consist of JCR nodes, and that increases the size of some indexes.
  • You will always have them with you; you cannot externalize them.
  • You always have to consider them when your application evolves. That means you either account for them with every style and component change, or you find a different solution. What about the assets referenced in these pages? And what if the rendering components/scripts require certain OSGi services/components in a certain version?

The last item in particular can be a long-term burden on your development velocity: the more old content you have, the more you need to invest in it.

 

Some suggestions for how you can improve it (of course everything is customization and not available OOTB):

  • Think about whether you can reduce the overhead of these pages, and potentially even transform them into something which can stand alone and has no dependency on the application itself, for example a PDF. You should still be able to find all text in the fulltext index.
  • Then you could reconfigure the ootb indexes not to cover your archive area anymore, but instead feed all these data into an external search engine, and let the authors search the archive only there.
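The second suggestion can be pictured as a routing step: content under the archive area is fed to an external search engine, and only the remaining paths stay in the repository's own indexes. The sketch below is a toy model under stated assumptions: `ARCHIVE_ROOT`, `ExternalSearchIndex` and `route_pages` are hypothetical names, and the tiny inverted index merely stands in for a real search engine. It does not show how the OOTB indexes themselves are reconfigured.

```python
import re

ARCHIVE_ROOT = "/content/archive"  # hypothetical archive area


def is_archived(path, archive_root=ARCHIVE_ROOT):
    """True if the path is the archive root or lies below it."""
    return path == archive_root or path.startswith(archive_root + "/")


class ExternalSearchIndex:
    """Toy inverted index standing in for an external search engine."""

    def __init__(self):
        self._postings = {}  # term -> set of page paths

    def add(self, path, text):
        for term in re.findall(r"\w+", text.lower()):
            self._postings.setdefault(term, set()).add(path)

    def search(self, query):
        """Return paths containing every term of the query."""
        terms = re.findall(r"\w+", query.lower())
        if not terms:
            return set()
        hits = [self._postings.get(term, set()) for term in terms]
        return set.intersection(*hits)


def route_pages(pages, external_index):
    """Feed archived pages to the external index.

    `pages` maps page path -> extracted text. Returns the paths that are
    not archived, i.e. those the repository's own index would still cover.
    """
    live_paths = []
    for path, text in pages.items():
        if is_archived(path):
            external_index.add(path, text)
        else:
            live_paths.append(path)
    return live_paths
```

Authors would then search the archive only through the external index, while the in-repo search covers just the live paths returned by `route_pages`.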

And there are probably a ton more possibilities.

 

The most important thing you should consider is the impact of an "in-repo" archive on your development velocity. As long as you maintain this old content as pages, you have to test it.

 

HTH,

Jörg

Nirmal_Jose
Community Advisor

07-10-2020

Makes sense, @Jörg_Hoh. I can see that two more ideas have come up which are better than mine and can fulfil my requirements: trash bin and jcr:versionhistory offloading.

As you said, archival is better placed outside the repository, as a PDF copy that is not meant to be restored as a page.

hamidk92094312
Employee

22-12-2020

Hi @Nirmal_Jose 

Ref: jcr:versionhistory offloading

I will mark this as duplicate for now since the same request has already been raised.

https://experienceleaguecommunities.adobe.com/t5/adobe-experience-manager-ideas/aem-version-offloadi...

Status changed to: Duplicate

Nirmal_Jose
Community Advisor

22-12-2020

Yes, thanks @hamidk92094312