SOLVED

Dangers of rotating tar files too often


Level 4

In our Stage environment, we have a 'Share Nothing' Author configuration with 200 GB of disk space for the Author. The disk space fills up too quickly: every day the workspaces/crx.default directory grows by 30 GB and tarJournal by 12 GB.
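For reference, here is a minimal monitoring sketch for tracking that growth day to day. It assumes a default crx-quickstart layout; the repository root path is my assumption, not taken from this setup:

    import os

    # Report the current size of the three tar locations so daily growth
    # (e.g. the 30 GB/day in workspaces/crx.default) can be tracked.
    REPO = "/opt/cq/crx-quickstart/repository"  # assumed install path
    TAR_DIRS = ["tarJournal", "workspaces/crx.default", "version"]

    def dir_size_gb(path):
        total = 0
        for root, _dirs, files in os.walk(path):
            for name in files:
                try:
                    total += os.path.getsize(os.path.join(root, name))
                except OSError:
                    pass  # file was rotated away mid-walk
        return total / (1024.0 ** 3)

    for d in TAR_DIRS:
        full = os.path.join(REPO, d)
        if os.path.isdir(full):
            print("%-28s %8.2f GB" % (d, dir_size_gb(full)))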

When tar optimisation runs overnight, it optimises the tar files but doesn't delete them unless maximumAge is set in repository.xml. By default, optimised files are not deleted unless they are a month old.

The problem is that since these optimised files are not deleted, when optimisation runs the following night it starts by processing the oldest file under the three locations: workspaces/crx.default, tarJournal and version. It takes the same amount of time to optimise these files even though they were already optimised the night before. Because they're already optimised, append=-1 in the logs shows there is nothing to append to the new output data files created during optimisation. The optimisation process really should start from where it left off rather than from the oldest file in the directory. (All the sites I've found say optimisation starts from where it previously stopped.) Correct me if I'm wrong, but this looks like a bug in CQ.

My question is: could we safely use the following setting, where maximumAge is set to 12 hours?
    <Cluster>    
        <Journal class="com.day.crx.persistence.tar.TarJournal">
            <param name="maximumAge" value="PT12H"/>
        </Journal>
    </Cluster>
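Before relying on that value, a small sketch like the following can preview which tar files are already outside the 12-hour window; the repository path and the plain *.tar glob are assumptions on my part:

    import glob, os, time

    # Preview which tar files would be deletion candidates under PT12H.
    REPO = "/opt/cq/crx-quickstart/repository"   # assumed install path
    MAX_AGE_HOURS = 12.0                         # mirrors value="PT12H"

    now = time.time()
    for d in ("tarJournal", "workspaces/crx.default", "version"):
        for tar in sorted(glob.glob(os.path.join(REPO, d, "*.tar"))):
            age_h = (now - os.path.getmtime(tar)) / 3600.0
            status = "older than maximumAge" if age_h > MAX_AGE_HOURS else "kept"
            print("%-58s %7.1f h  %s" % (tar, age_h, status))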

I'd really appreciate your feedback on the following questions:

1) The optimisation process starts from the oldest file rather than from where it left off during the previous tar optimisation run. Is this a bug?

2) With this setting, files that have been processed and optimised are deleted once they are more than 12 hours old. If the files have already been processed, optimised and appended to the output data files created during optimisation, what is the motivation for not deleting them?

3) What are the dangers of rotating files instead of leaving them there for, say, a month? When we run a migration, around 170 tar files are created under workspaces/crx.default. We would like them deleted as soon as they are optimised and appended to new data files, but I am not aware of the consequences (if any) of deleting already-optimised files.

 

How to Reproduce:

Easy to reproduce. In your CQ environment, leave the default setting in repository.xml:
<Journal class="com.day.crx.persistence.tar.TarJournal"/>

Make sure you have some tar files under 
1) /repository/tarJournal
2) /repository/workspaces/crx.default
3) /repository/version

Run the tar optimisation process from http://localhost:4502/system/console/jmx/com.adobe.granite%3Atype%3DRepository by invoking startTarOptimization(). Watching the logs, you will see the data tar files being optimised and appended to new output data files. When the process completes, new data files have been created, but the processed files are not deleted.
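If you want to script this instead of clicking through the console, something like the following may work; the /op/ URL pattern is an assumption based on the Felix web-console JMX plugin, so verify it on your instance and replace the assumed admin:admin credentials:

    import base64
    import urllib.request

    # Hedged sketch: POST to the JMX console to invoke startTarOptimization().
    # The /op/ URL pattern is assumed, not confirmed for this CQ version.
    URL = ("http://localhost:4502/system/console/jmx/"
           "com.adobe.granite%3Atype%3DRepository/op/startTarOptimization/")

    req = urllib.request.Request(URL, data=b"", method="POST")
    token = base64.b64encode(b"admin:admin").decode("ascii")  # assumed creds
    req.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(req) as resp:
        print("HTTP", resp.status)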

Now re-run tar optimisation. You'll notice in the logs that it again starts by processing the oldest file in the directories, and you'll see append=-1 since there is nothing to append, yet the time taken to process each tar file is no different. This wastes time re-processing already-processed files, and with a big enough backlog the optimisation process might never reach the newer files.

Now set your tar files to be rotated:
    <Cluster>    
        <Journal class="com.day.crx.persistence.tar.TarJournal">
            <param name="maximumAge" value="PT5H"/>
        </Journal>
    </Cluster>
You'll observe that optimised tar files older than 5 hours are deleted.

 

Business Impact: CQ on our Author servers shut down after running out of disk space because of the huge amount consumed by tar files. What are the dangers of rotating the tar files more often? Do we really need to keep already-optimised tar files?

Product version: AEM CQ 5.6.1


6 Replies


Correct answer by
Level 10

IMO you are not heading in the right direction. Let me get the basics right first, then answer your questions with solutions.

Basics:

1) Watch the presentation at https://github.com/cqsupport/webinar-aem-monitoring-maintenance

2) Optimization starts from where it left off, and it applies only to the workspace and version stores.

3) The tar journal cleanup simply deletes files based on timestamp, so "where it left off" does not apply there.

Answers to your questions:

1) There is no bug, and I would be surprised if there were. After understanding the basics, if you still believe it's a bug, feel free to file an official support request and the support team will be happy to guide you.

2) As I said, for the tar journal the process just deletes files based on the configured maximumAge. Technically it does not optimize anything there, and there is no point in deleting files younger than that age.

3) If you are not running a clustered environment, you can disable the tar journal, as it has no impact there. Or

you can rotate files frequently with a short maximumAge and trigger a manual run by creating optimize.tar only in the tarJournal folder (see the sketch after this list). Or

provision enough disk space and disable tar optimization; once the migration is complete, or during off-peak hours, run tar optimization manually.
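A minimal sketch of the optimize.tar trick from option 3; the crx-quickstart path is my assumption:

    import os

    # Create an empty optimize.tar marker in the tarJournal folder so that
    # folder is picked up on the next optimization run (per the tip above).
    tar_journal = "/opt/cq/crx-quickstart/repository/tarJournal"
    marker = os.path.join(tar_journal, "optimize.tar")
    open(marker, "a").close()  # empty marker file
    print("created", marker)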

Best recommended approach:

1) Follow a good migration strategy. That is a separate topic.

2) Make sure you have more than 5 times the disk space of the content you are migrating. (Technically 2 times is sufficient, but I have seen many customers plan badly, re-run some workflows, etc., and blow out their disk space.)

3) Remember: if you notice the repository growing without any content change, follow [1] to find the cause and fix the implementation. If the repository is growing rapidly with consistently minimal content changes, there is something you can improve in the implementation. Some common mistakes customers make are listed in [2].

 

[1]   http://helpx.adobe.com/experience-manager/kb/analyze-unusual-repository-growth.html

[2]

- Improper SSO configuration that performs a sign-on for each request

- Misconfiguration of reverse replication

- Deploying a migrated package that includes all renditions and running it on the instance unnecessarily


Level 4

Sham, thanks for taking the time to reply.

You say: 3) The tar journal cleanup simply deletes files based on timestamp, so "where it left off" does not apply there.

However, I have seen that when I run tar optimisation, the tar files under tarJournal are optimised and, once done, deleted, and a new output data file is created under tarJournal. See attachment. Files are optimised, appended to new files, and deleted if maximumAge is set; if not, the same files are optimised again, with append=-1.

So I'm dubious about disabling the tarJournal. For now we're simply rotating these files every 24 hours.

Thanks again for the other information and links you provided. We've noticed that as long as the tar optimisation process runs successfully overnight (and no other processes block it, which slows it right down), disk space is recovered daily.


Level 10

anjali.biddanda wrote...

Sham, thanks for taking the time to reply.

You say: 3) The tar journal cleanup simply deletes files based on timestamp, so "where it left off" does not apply there.

However, I have seen that when I run tar optimisation, the tar files under tarJournal are optimised and, once done, deleted, and a new output data file is created under tarJournal. See attachment. Files are optimised, appended to new files, and deleted if maximumAge is set; if not, the same files are optimised again, with append=-1.

So I'm dubious about disabling the tarJournal. For now we're simply rotating these files every 24 hours.

Thanks again for the other information and links you provided. We've noticed that as long as the tar optimisation process runs successfully overnight (and no other processes block it, which slows it right down), disk space is recovered daily.

 

Hmmm... optimization of the tarJournal is almost a no-op and will not affect performance. In the new output file, did you see a difference in file size?


Level 4

Adobe sites mention: "The journal helps maintain data consistency and helps the system recover quickly from crashes. In a clustered environment the journal plays the critical role of synchronizing content across cluster instances."

We don't have a clustered environment, so does that mean we can safely disable the tarJournal? We take nightly content backups. If you've done this before, can you let me know how to disable the tarJournal?


Level 10

anjali.biddanda wrote...

Adobe sites mention: "The journal helps maintain data consistency and helps the system recover quickly from crashes. In a clustered environment the journal plays the critical role of synchronizing content across cluster instances."

We don't have a clustered environment, so does that mean we can safely disable the tarJournal? We take nightly content backups. If you've done this before, can you let me know how to disable the tarJournal?

 

From the repository.xml file, remove the entire Cluster element [1]:

[1]

<Cluster>
    <Journal class="com.day.crx.persistence.tar.TarJournal"/>
</Cluster>
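As a quick sanity check after the edit (a sketch on my part; the path assumes a default install), you can confirm no Cluster element remains:

    import xml.etree.ElementTree as ET

    # Verify repository.xml no longer contains a Cluster element.
    tree = ET.parse("/opt/cq/crx-quickstart/repository/repository.xml")
    remaining = tree.getroot().findall(".//Cluster")
    print("Cluster elements remaining:", len(remaining))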