

Programmatic replication not working in AEM 6.5


Level 5

Hi,

We are replicating content to one of the publish instances via a custom workflow step (sling:resourceType=dam/components/workflow/publish-with-replication-agent). We also have a dedicated replication agent for this purpose, and its connection test is successful.

In the custom workflow process step, we set the field Replication Agent ID to the replication agent name, i.e. if the replication agent is https://<<hostname>>/etc/replication/agents.author/publish3.html, then Replication Agent ID=publish3.
While initiating this workflow for a page, we see the below exception in the error log:

16.03.2021 07:26:28.961 *ERROR* [JobHandler: /var/workflow/instances/server2/2021-03-15/publish-to-china-server_2:/content/mycompany/zh/home] org.apache.jackrabbit.oak.plugins.blob.AbstractSharedCachingDataStore Error retrieving record [3b6ae754b20a8050a6683bf8f705db91dbdf5074cf7892b50d92f4a7a176dbca]
org.apache.jackrabbit.core.data.DataStoreException: com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: DB09G0BSZH60YVSB; S3 Extended Request ID: wsxDwz+E9FkqFoA9AvlAvfwz5IGFAgUeMYFJv7rw9gkYLL0OZ+5Hb3o7py8njQkV0bOQ1dM34nA=), S3 Extended Request ID: wsxDwz+E9FkqFoA9AvlAvfwz5IGFAgUeMYFJv7rw9gkYLL0OZ+5Hb3o7py8njQkV0bOQ1dM34nA=
at org.apache.jackrabbit.oak.blob.cloud.s3.S3Backend.getRecord(S3Backend.java:638) [org.apache.jackrabbit.oak-blob-cloud:1.10.5]
at org.apache.jackrabbit.oak.plugins.blob.AbstractSharedCachingDataStore.getRecordIfStored(AbstractSharedCachingDataStore.java:210) [org.apache.jackrabbit.oak-blob-plugins:1.22.3]
at org.apache.jackrabbit.oak.plugins.blob.AbstractSharedCachingDataStore.getRecord(AbstractSharedCachingDataStore.java:188)


10 Replies


Community Advisor

Hi @srikanthpogula ,

We need more information for a clear understanding.

  1. What kind of session is being used to replicate programmatically?
  2. If you are using a system user to maintain the session, that system user should have the appropriate access.
  3. Or is this replication implemented in some other way?


Level 5

Hi @Rohit_Utreja, here is the flow:

  • The content author clicks Start Workflow and chooses a custom workflow model which contains the custom process step
  • In the custom workflow process step, we read the replicationAgent and set it as the agent filter in ReplicationOptions (see the sketch after this list)
  • We use user.jcr.session when initializing the ResourceResolver
  • We also have a replication agent which publishes content to the publish instance; this agent is set to be ignored on normal replication
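
For reference, a custom process step that replicates through one named agent typically looks like the sketch below. This is a minimal sketch, not our exact code: the class name, the process label, and the default agent id "publish3" are assumptions; the Granite workflow API and the com.day.cq.replication classes are standard.

import javax.jcr.Session;

import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Reference;

import com.adobe.granite.workflow.WorkflowException;
import com.adobe.granite.workflow.WorkflowSession;
import com.adobe.granite.workflow.exec.WorkItem;
import com.adobe.granite.workflow.exec.WorkflowProcess;
import com.adobe.granite.workflow.metadata.MetaDataMap;
import com.day.cq.replication.AgentIdFilter;
import com.day.cq.replication.ReplicationActionType;
import com.day.cq.replication.ReplicationException;
import com.day.cq.replication.ReplicationOptions;
import com.day.cq.replication.Replicator;

@Component(service = WorkflowProcess.class,
        property = {"process.label=Publish With Replication Agent (sketch)"})
public class PublishWithAgentProcess implements WorkflowProcess {

    @Reference
    private Replicator replicator;

    @Override
    public void execute(WorkItem item, WorkflowSession wfSession, MetaDataMap args)
            throws WorkflowException {
        // Payload path of the page being published
        String path = item.getWorkflowData().getPayload().toString();

        // Agent id configured on the process step, e.g. "publish3" (assumed default)
        String agentId = args.get("PROCESS_ARGS", "publish3");

        // Restrict replication to the one named agent; an agent set to
        // "Ignore default" only participates when a filter selects it
        ReplicationOptions options = new ReplicationOptions();
        options.setFilter(new AgentIdFilter(agentId));

        try {
            Session session = wfSession.adaptTo(Session.class);
            replicator.replicate(session, ReplicationActionType.ACTIVATE, path, options);
        } catch (ReplicationException e) {
            throw new WorkflowException("Replication failed for " + path, e);
        }
    }
}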


Community Advisor

@srikanthpogula 
Try using a system user instead of the generic JCR session, and give this system user the appropriate permissions.
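
A minimal sketch of what that looks like inside the process step, assuming a subservice name "replication-service" mapped to a system user (both names are examples; the mapping lives in an org.apache.sling.serviceusermapping.impl.ServiceUserMapperImpl.amended config):

// Assumes: @Reference private ResourceResolverFactory resolverFactory;
// imports: java.util.{Collections, Map}, javax.jcr.Session, and
// org.apache.sling.api.resource.{ResourceResolver, ResourceResolverFactory, LoginException}
Map<String, Object> authInfo = Collections.singletonMap(
        ResourceResolverFactory.SUBSERVICE, (Object) "replication-service");

try (ResourceResolver resolver = resolverFactory.getServiceResourceResolver(authInfo)) {
    // JCR session bound to the system user, not to the workflow initiator
    Session session = resolver.adaptTo(Session.class);
    replicator.replicate(session, ReplicationActionType.ACTIVATE, path, options);
} catch (LoginException | ReplicationException e) {
    throw new WorkflowException("Replication via service user failed for " + path, e);
}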


Community Advisor

Hello @srikanthpogula, try this:
1. shut down the AEM instance
2. run offline TAR compaction (see the command sketch after this list)
3. restart the AEM instance
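
For step 2, offline compaction is run with the oak-run jar matching your Oak version while AEM is stopped; a sketch, assuming a TarMK segment store at the default location:

java -jar oak-run-<oak-version>.jar compact /path/to/crx-quickstart/repository/segmentstore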

Also, please check whether your system user has the necessary access permissions. After all this, if it still doesn't work, could you share the complete log (from when you replicate the content / start the workflow process)?

Thanks!


Employee Advisor

Checking the error [1], it seems like an issue with the S3 config. Do you see the same issue when replication is triggered manually?

Also, set up a DEBUG logger on "com.day.cq.replication" to see if the issue is related to the replication bundle. Lastly, as others mentioned, try adding the service user used for the workflow to the administrators group.
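
Such a logger can be created under Sling > Log Support in the Felix web console, or via an OSGi config. A sketch as a Felix .config file (the file location and log file name here are examples):

/apps/myproject/config/org.apache.sling.commons.log.LogManager.factory.config-replication.config:

org.apache.sling.commons.log.level="debug"
org.apache.sling.commons.log.file="logs/replication-debug.log"
org.apache.sling.commons.log.names=["com.day.cq.replication"]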

[1] 16.03.2021 07:26:28.961 *ERROR* [JobHandler: /var/workflow/instances/server2/2021-03-15/publish-to-china-server_2:/content/mycompany/zh/home] org.apache.jackrabbit.oak.plugins.blob.AbstractSharedCachingDataStore Error retrieving record [3b6ae754b20a8050a6683bf8f705db91dbdf5074cf7892b50d92f4a7a176dbca]
org.apache.jackrabbit.core.data.DataStoreException: com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: DB09G0BSZH60YVSB; S3 Extended Request ID:


Level 5

Hi @Jaideep_Brar, the replication agent is configured with the Ignore Default behavior; I will check if I can disable this and replicate manually.


Level 5

Hi,

Can you check the S3 bucket configuration? I can see a 404 in it:

org.apache.jackrabbit.core.data.DataStoreException: com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found;


Employee Advisor

Hi,

Can you please share the complete stacktrace? It seems that at some point the workflow accesses the S3 datastore but cannot find a binary. It would be interesting to know at which point in the process it fails on the missing blob.

Does this error reappear if you try again some minutes later at the same path with the same settings?


Level 5

Hi @Jorg_Hoh, the issue did not occur again once we rolled out the content again from the master copy. It appears to have been caused by corrupted page nodes. We were then able to publish content using the workflow successfully.

Though I have one question on the workflow step Publish with Replication Agent: what is the value for the field Replication Agent ID? Is it the replication agent name?

Can anyone help me with this information?

Thanks

Srikanth Pogula 


Employee Advisor

With the agentId you can specify which agent to use. If you leave it empty, all agents are used, which is normally what you want.
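
In code terms, a sketch using the standard com.day.cq.replication API (session, path, and the agent id "publish3" are placeholders):

// Empty agent id -> no filter on the options: every enabled agent that
// accepts the action handles the request ("Ignore default" agents stay out).
ReplicationOptions options = new ReplicationOptions();
replicator.replicate(session, ReplicationActionType.ACTIVATE, path, options);

// Agent id set -> AgentIdFilter: only the matching agent is used, e.g.
// /etc/replication/agents.author/publish3 has the id "publish3".
options.setFilter(new AgentIdFilter("publish3"));
replicator.replicate(session, ReplicationActionType.ACTIVATE, path, options);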