
SOLVED

Shared s3 blobstore maintenance | AEM 6.5 on-prem


Employee

AEM 6.5 on-prem | Shared S3 blobstore:
Blobstore maintenance has not been run in a long time, and there are now hundreds of millions of objects in the store, which would make running the maintenance tasks complicated.
1. Are there any complications involved in running the maintenance tasks now, given that they have not been run for a long time? Have you faced any issues running garbage collection, revision cleanup, or other maintenance tasks with a shared S3 datastore when they had not been run for a very long time?
2. What are the recommendations/best practices for garbage collection and revision cleanup on a shared S3 blobstore, given that these activities have not been performed for a very long time?

 

1 Accepted Solution


Correct answer by
Level 6

Hi @Himanshu_Phulara ,

 

One known issue that our team has faced before is missing blob IDs in a shared S3 datastore architecture. This may or may not be applicable in your case.

 

Typically, after maintenance you would start seeing this message in the error.log:

"Error occurred while obtaining InputStream for blobId".

 

In such cases, you can work through the steps below:

  • Run a datastore consistency check.

 

java -jar oak-run-*.jar datastorecheck --consistency --ref --id --s3ds crx-quickstart/install/org.apache.jackrabbit.oak.plugins.blob.datastore.S3DataStore.config --repoHome crx-quickstart/repository --store crx-quickstart/repository/segmentstore --dump temp --verbose --track

 

 

  • Use the command below to extract the missing blob IDs from the error.log files:

 

grep "Error occurred while obtaining InputStream for blobId" error.log* | grep -Eo "[0-9a-f]{40,200}" | awk '{ print substr($1, 0,2) "/" substr($1, 3,2) "/" substr($1, 5,2) "/" $1 }' | sort -u > missing_ds_files.txt

 

  • After executing this command you will have a list of the missing blobs in the file missing_ds_files.txt, and you can then recover these missing records from the S3 bucket or create a dummy placeholder for each (a rough sketch of this step follows below).
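
To make the last two steps more concrete, here is a minimal sketch of how the file produced above could be processed. The bucket name (my-aem-datastore-bucket), the use of the AWS CLI, and the assumption that the S3DataStore stores each blob under a key of the form <first four characters>-<rest of the id> are all mine and not from the original instructions, so please verify them against your environment (ideally on a staging copy) before running anything:

#!/usr/bin/env bash
# Hedged sketch: check whether each missing blob still exists in the S3 bucket; if it
# does not, create a zero-byte placeholder in the local datastore layout so the
# repeated errors stop. "my-aem-datastore-bucket" is a hypothetical name, and the
# key convention below is an assumption.
BUCKET="my-aem-datastore-bucket"
DS_ROOT="crx-quickstart/repository/datastore"

while read -r rel_path; do
  blob_id="${rel_path##*/}"              # last path segment is the full blob id
  s3_key="${blob_id:0:4}-${blob_id:4}"   # assumed S3DataStore key convention

  if aws s3api head-object --bucket "$BUCKET" --key "$s3_key" >/dev/null 2>&1; then
    echo "Still present in the bucket, nothing to do: $blob_id"
  else
    # The binary itself is lost; the placeholder only silences the errors.
    mkdir -p "$DS_ROOT/$(dirname "$rel_path")"
    touch "$DS_ROOT/$rel_path"
    echo "Created local placeholder for: $blob_id"
  fi
done < missing_ds_files.txt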

Unfortunately, the document we referred to for these instructions now returns a 404, so I would suggest reaching out to Adobe Support for any missing information.

https://helpx.adobe.com/experience-manager/kb/oak-blobstore-inconsistency-blobId.html

 

For performing the datastore GC, please refer to the official documentation:

Configuring node stores and data stores in AEM 6 | Adobe Experience Manager
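
One point worth calling out for a shared S3 data store, which that documentation also stresses: the mark phase of the garbage collection has to complete on every repository sharing the bucket before the sweep runs on any one of them, otherwise blobs still referenced by another instance can be deleted. The commands below are only a rough sketch of that ordering using oak-run's datastore command; the --collect-garbage option and its mark-only argument are my assumption based on recent oak-run builds, so confirm the exact flags with the --help output of the oak-run version matching your Oak version (the same mark/sweep sequence can also be driven from the JMX console on running instances).

# Rough sketch, flags assumed - verify with: java -jar oak-run-*.jar datastore --help
# 1) Mark phase: run on EVERY repository sharing the S3 data store (no blobs are deleted here).
java -jar oak-run-*.jar datastore --collect-garbage true --s3ds crx-quickstart/install/org.apache.jackrabbit.oak.plugins.blob.datastore.S3DataStore.config --store crx-quickstart/repository/segmentstore --repoHome crx-quickstart/repository

# 2) Sweep phase: only after the mark phase has finished on all sharing instances, run once
#    on a single instance without the mark-only argument to actually delete unreferenced blobs.
java -jar oak-run-*.jar datastore --collect-garbage --s3ds crx-quickstart/install/org.apache.jackrabbit.oak.plugins.blob.datastore.S3DataStore.config --store crx-quickstart/repository/segmentstore --repoHome crx-quickstart/repository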

 

Thanks,

Ram

 


3 Replies


Employee

Hi Varun, I have this documentation link; what I am looking for is whether someone has actually worked on this and can point out issues that might arise, along with recommendations around them.
