
SOLVED

AEM 6.5 - Reducing size/ clean up for 'repository/segmentstore'

Level 2

Hi Adobe Community,

Our cluster runs AEM in an Author and Publish setup, and we are in the process of upgrading AEM 6.5 to a later service pack.

While cleaning up data to prepare the cluster for the upgrade and backup, we noticed that the segment store on the publish instances consumes a large amount of storage, mostly because of 'data*.tar.bak' files.

/aem/crx-quickstart/repository # du -sh *
748.0K  blobids
6.6G    datastore
425.4M  index
152.2G  segmentstore

We are looking for a safe way to clean up the .bak files and to configure automatic cleanup of these backup files at regular intervals. Please advise.

Thanks 

Pavan

1 Accepted Solution

Correct answer by
Level 10

Hi @Pavan_KumarTi ,

To clean up and reduce the size of the 'segmentstore' in AEM 6.5, particularly focusing on the 'data*.tar.bak' files, you can follow these steps. This will help you safely manage and clean up the repository before your upgrade:

Manual Cleanup

  1. Stop the AEM Instance: Before making any changes to the repository files, ensure the AEM instance is stopped to prevent data corruption.
    ./aem stop

  2. Backup: Before performing any cleanup, ensure you have a full backup of your repository and any other critical data.

  3. Run Offline Compaction: Run offline compaction to clean up the segment store. This process removes old and unused segments. Offline compaction is run with the oak-run tool that matches the Oak version of your instance, pointed at the segmentstore directory, rather than with the quickstart jar:
    java -jar oak-run-<version>.jar compact /aem/crx-quickstart/repository/segmentstore

  4. Delete Old Backup Files: After offline compaction has completed successfully, you can manually delete the 'data*.tar.bak' files. These are backup files created before compaction and are typically safe to remove after a successful compaction.
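
For the delete step, a small shell sketch can help you see what would be removed before anything is deleted. It assumes the segment store lives at /aem/crx-quickstart/repository/segmentstore (the path from the question), that the instance is stopped, and that a backup exists. The function names are made up for this example, and by default it only prints what it would delete.

```shell
#!/bin/sh
# Sketch: report, and optionally delete, data*.tar.bak files in a segment store.
# Function names are illustrative; run only while AEM is stopped and backed up.

# list_bak_files DIR -> prints one .bak file path per line
list_bak_files() {
  for bak_file in "$1"/data*.tar.bak; do
    if [ -f "$bak_file" ]; then
      printf '%s\n' "$bak_file"
    fi
  done
}

# bak_total_kb DIR -> prints the combined size of the .bak files in KiB
bak_total_kb() {
  bak_total=0
  for bak_file in "$1"/data*.tar.bak; do
    [ -f "$bak_file" ] || continue
    bak_size=$(du -k "$bak_file" | cut -f1)
    bak_total=$((bak_total + bak_size))
  done
  echo "$bak_total"
}

# remove_bak_files DIR [--delete] -> dry run unless --delete is given
remove_bak_files() {
  for bak_file in "$1"/data*.tar.bak; do
    [ -f "$bak_file" ] || continue
    if [ "${2:-}" = "--delete" ]; then
      rm -f -- "$bak_file" && echo "deleted $bak_file"
    else
      echo "would delete $bak_file"
    fi
  done
}

# Example (path is an assumption taken from the question):
#   bak_total_kb /aem/crx-quickstart/repository/segmentstore
#   remove_bak_files /aem/crx-quickstart/repository/segmentstore
#   remove_bak_files /aem/crx-quickstart/repository/segmentstore --delete
```

Run remove_bak_files without --delete first to review the list, then repeat with --delete once you are satisfied. Regular data*.tar files are never touched by the glob.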

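Automatic Cleanup Configuration

To configure automatic cleanup of old backup files and manage repository size, you can use the following OSGi configurations and repository maintenance tasks. The configuration labels can differ between AEM and Oak versions; in AEM 6.5, Online Revision Cleanup and Data Store Garbage Collection are also available as maintenance tasks under Tools > Operations > Maintenance.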

  1. Configure TarMK Compaction: In the AEM Web Console, configure the TarMK compaction to run automatically and clean up old segments:
    • Go to AEM Web Console: http://localhost:4502/system/console/configMgr
    • Search for and configure "Apache Jackrabbit Oak Segment Tar Compaction":
      • Set "Compaction Mode" to custom
      • Configure other parameters such as "Compaction Retained Generations", "Compaction Force Timeout", etc., according to your needs.
  2. Enable Revision Cleanup: Configure automatic revision cleanup, which includes compaction and other maintenance tasks:
    • Go to AEM Web Console: http://localhost:4502/system/console/configMgr
    • Search for and configure "Apache Jackrabbit Oak Revision GC":
      • Set "GC Type" to tar
      • Schedule the maintenance task according to your requirements (e.g., daily, weekly).
  3. Configure Blob Garbage Collection to clean up unused blobs in the data store.

Example Configuration Steps

Here are some example steps to configure the automatic cleanup in AEM:

  1. Access AEM Web Console: Open your browser and go to http://<AEM-HOST>:<PORT>/system/console/configMgr.
  2. Configure TarMK Compaction:
    • Search for "Apache Jackrabbit Oak Segment Tar Compaction".
    • Set "Compaction Mode" to custom.
    • Configure the schedule (e.g., 0 0 2 * * ? for daily at 2 AM).
  3. Configure Revision GC:
    • Search for "Apache Jackrabbit Oak Revision GC".
    • Set "GC Type" to tar.
    • Schedule the GC process (e.g., 0 0 3 * * ? for daily at 3 AM).
  4. Configure Blob GC:
    • Search for "Apache Jackrabbit Oak Blob Garbage Collection".
    • Enable and schedule the process according to your needs.

Important Notes

  • Test in Non-Production: Always test these configurations and clean-up processes in a non-production environment before applying them to production.
  • Regular Monitoring: Regularly monitor the segment store size and the success of cleanup tasks. Ensure the tasks are running as expected.
  • Documentation and Logs: Keep documentation of your configurations and review AEM logs to troubleshoot any issues related to repository maintenance.

By following these steps, you can effectively manage and reduce the size of your 'segmentstore' in AEM 6.5, ensuring a smoother upgrade and backup process.
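
For the monitoring and log review notes above, a minimal shell sketch along these lines could be run from cron. The paths, the size limit, and the function names are illustrative assumptions. The 'TarMK GC' search string assumes that revision cleanup messages in your error.log carry that prefix, so check your own logs before relying on it.

```shell
#!/bin/sh
# Sketch: watch segment store size and recent revision cleanup log entries.
# Function names, paths and the limit are illustrative, not part of AEM.

# segmentstore_kb DIR -> prints the size of DIR in KiB
segmentstore_kb() {
  du -sk "$1" | cut -f1
}

# check_segmentstore DIR LIMIT_KB -> prints a status line; returns 1 above the limit
check_segmentstore() {
  used_kb=$(segmentstore_kb "$1")
  if [ "$used_kb" -gt "$2" ]; then
    echo "WARN segmentstore uses ${used_kb} KiB (limit $2 KiB)"
    return 1
  fi
  echo "OK segmentstore uses ${used_kb} KiB (limit $2 KiB)"
}

# recent_gc_messages LOGFILE [N] -> last N lines mentioning TarMK GC
# (the 'TarMK GC' prefix is an assumption about the log format)
recent_gc_messages() {
  grep 'TarMK GC' "$1" | tail -n "${2:-5}"
}

# Example (paths are assumptions; 104857600 KiB is a 100 GiB limit):
#   check_segmentstore /aem/crx-quickstart/repository/segmentstore 104857600
#   recent_gc_messages /aem/crx-quickstart/logs/error.log 10
```

The non-zero return code from check_segmentstore makes it easy to hook into whatever alerting you already use.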






3 Replies

Community Advisor

@Pavan_KumarTi, as mentioned by @HrishikeshKa, segment store compaction is the only option. There are two ways to run it: 1. Online Compaction 2. Offline Compaction.

Please check the Adobe documentation for details.

https://experienceleague.adobe.com/en/docs/experience-manager-65/content/implementing/deploying/depl...

Level 2

Hi Shashi,

Thanks for the update. We have configured online compaction, which runs weekly as expected. Yet we still end up with a large volume of 'data*.tar.bak' files in our segment store.

Are these backup files safe to remove straight away, or do I need to run offline compaction before removing them?