Good day all,
As of this morning, default replication to one of our publishers is failing (the other is fine). No code or configuration changes have been made in the last few days that could have caused it.
The status of the replication agent on the author is "blocked". When I test the connection of the agent from the author, it returns an HTTP 403 - Forbidden code.
It's not a network connectivity issue; I've tested this. I also checked the other common causes (explained here https://experienceleague.adobe.com/en/docs/experience-manager-65/content/implementing/deploying/conf...), but found no solution there.
On the publisher side, I see the following error appear in the log when testing the connection (full error attached):
Any help with this issue would be much appreciated.
Could you check the status of com.day.cq.replication.impl.servlets.ReplicationServlet under /system/console/components on the failing publisher?
I see I overlooked this comment earlier. The ReplicationServlet is active and the configuration seems to be correct:
Could you also follow the steps mentioned in this article:
https://experienceleague.adobe.com/en/docs/experience-cloud-kcs/kbarticles/ka-17467
Thanks for your reply and the link. Unfortunately, none of the resolutions there worked in my case. The issue is still present.
The log you posted does not look like an error; it looks like a debug statement. The 403 probably means that the configured transport user (Agent User ID) does not have enough permissions. I believe the transport user also has to exist on the publish instance you are replicating to, so please cross-check that.
Thanks for your reply. You're right that it's not an error. It's a warning statement, plus a debug statement that goes into a bit more detail about the warning.
The replication user is present on the publish instance and also has the correct access. Just as a test I also tried using the admin user and that yielded the same error. So I don't think it's an issue with access rights.
Could you post the test connection log for both of your publishers (the one that works and the one that doesn't)?
This is the test connection log for the failing publisher:
And this is the test connection log for the publisher that is succeeding:
Replication test to http://publish02:4503/bin/receive?sling:authRequestLogin=1
2024-06-19 16:56:15 - Create new HttpClient for Publish Agent
2024-06-19 16:56:15 - * Auth User: replication-receiver
2024-06-19 16:56:15 - * HTTP Version: 1.1
2024-06-19 16:56:15 - * Connect Timeout: 900000
2024-06-19 16:56:15 - * Socket Timeout: 900000
2024-06-19 16:56:15 - adding header: Action:Test
2024-06-19 16:56:15 - adding header: Path:/content
2024-06-19 16:56:15 - adding header: Handle:/content
2024-06-19 16:56:15 - deserialize content for delivery
2024-06-19 16:56:15 - No message body: Content ReplicationContent.VOID is empty
2024-06-19 16:56:15 - Sending POST request to http://publish02:4503/bin/receive?sling:authRequestLogin=1
2024-06-19 16:56:15 - sent. Response: 200 OK
2024-06-19 16:56:15 - ------------------------------------------------
2024-06-19 16:56:15 - Sending message to publish02:4503
2024-06-19 16:56:15 - >> POST /bin/receive HTTP/1.0
2024-06-19 16:56:15 - >> Action: Test
2024-06-19 16:56:15 - >> Path: /content
2024-06-19 16:56:15 - >> Handle: /content
2024-06-19 16:56:15 - >> Referer: about:blank
2024-06-19 16:56:15 - >> Content-Length: 0
2024-06-19 16:56:15 - >> Content-Type: application/octet-stream
2024-06-19 16:56:15 - --
2024-06-19 16:56:15 - << HTTP/1.1 200 OK
2024-06-19 16:56:15 - << Date: Wed, 19 Jun 2024 16:56:15 GMT
2024-06-19 16:56:15 - << X-Content-Type-Options: nosniff
2024-06-19 16:56:15 - << X-Frame-Options: SAMEORIGIN
2024-06-19 16:56:15 - << Content-Type: text/plain;charset=utf-8
2024-06-19 16:56:15 - << Content-Length: 26
2024-06-19 16:56:15 - <<
2024-06-19 16:56:15 - << ReplicationAction TEST ok.
2024-06-19 16:56:15 - Message sent.
2024-06-19 16:56:15 - ------------------------------------------------
2024-06-19 16:56:15 - Replication (TEST) of /content successful.
Replication test succeeded
Nothing interesting here. You could try creating a brand-new user and testing with that, in case the users on your publish01 are somehow affected.
Even when creating a new user, the error is the same.
I've found the issue and have managed to resolve it. I don't know what caused it yet, though. For completeness, I'll post the steps I took and the solution that fixed it.
After some more investigating, I found that POST requests never reached the ReplicationServlet responsible for handling them, while GET requests did. Authentication through the GET request worked fine, so authentication was not the problem.
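For anyone wanting to repeat that GET-versus-POST comparison outside the replication agent, a minimal sketch could look like the following. It assumes the receiver accepts HTTP Basic credentials for the transport user; the helper names `build_request` and `status_of`, the password, and the `user_agent` value are all placeholders of mine, not anything taken from the agent itself. The URL and the Action/Path/Handle headers are copied from the test connection log above.

```python
import base64
import urllib.error
import urllib.request


def build_request(base_url, method, user, password, user_agent):
    """Build (without sending) a request shaped like the agent's test call.

    Assumption: Basic auth is accepted; user_agent is whatever the caller
    wants to present, since the filter may treat agents differently.
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "User-Agent": user_agent,
        "Action": "Test",
        "Path": "/content",
        "Handle": "/content",
    }
    data = None
    if method == "POST":
        # The agent's test call sends an empty octet-stream body.
        headers["Content-Type"] = "application/octet-stream"
        data = b""
    return urllib.request.Request(
        f"{base_url}/bin/receive?sling:authRequestLogin=1",
        data=data,
        headers=headers,
        method=method,
    )


def status_of(request, timeout=10):
    """Send the request and return the HTTP status code, including 4xx/5xx."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


# Example (needs network access to the publisher; credentials are placeholders):
# for method in ("GET", "POST"):
#     req = build_request("http://publish02:4503", method,
#                         "replication-receiver", "secret", "my-test-agent/1.0")
#     print(method, status_of(req))
```

If GET returns a success code while POST returns 403 with the same credentials, that points away from permissions and toward something that only inspects state-changing methods.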
Then I set the ROOT logger to TRACE for a short while, ran another connection test, and found that the CSRF filter was being triggered and was rejecting the POST request because it carried no token.
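To make it clearer why GET passed while POST was rejected, here is a toy model of how a CSRF filter of this kind typically decides. This is only a sketch of the general idea, not Adobe's implementation: the class `CsrfFilterModel`, its field names, and the user-agent pattern are all made up for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class CsrfFilterModel:
    """Toy CSRF filter; every name and default here is illustrative only."""
    protected_methods: frozenset = frozenset({"POST", "PUT", "DELETE"})
    allow_safe_user_agents: bool = True
    safe_user_agent_patterns: tuple = (r"Apache-HttpClient/.*",)

    def rejects(self, method, user_agent, has_token):
        """Return True if the request would be refused (for example with a 403)."""
        if method.upper() not in self.protected_methods:
            return False  # read-only methods such as GET are never checked
        if has_token:
            return False  # a token was supplied (this toy model does not validate it)
        if self.allow_safe_user_agents and any(
            re.fullmatch(pattern, user_agent)
            for pattern in self.safe_user_agent_patterns
        ):
            return False  # trusted non-browser client, token not required
        return True


healthy = CsrfFilterModel()
changed = CsrfFilterModel(allow_safe_user_agents=False)

agent = "Apache-HttpClient/4.5"  # hypothetical user agent of the replication agent
print(healthy.rejects("POST", agent, has_token=False))
print(changed.rejects("POST", agent, has_token=False))
print(changed.rejects("GET", agent, has_token=False))
```

In this model, switching off the trusted-agent exemption is enough to turn a token-less POST from a server-side client into a rejection, while GET is unaffected, which matches the behaviour described above.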
Then I looked up the current CSRF filter configuration on each publisher and found that, on one of them, it had somehow been changed (see screenshots) to always check for a CSRF token, even for non-browser agents.
After correcting the configuration, replication is now fixed.
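Because the other publisher was unaffected, comparing the two instances' settings side by side is a quick way to spot this kind of drift. A minimal sketch, assuming the properties of each configuration have been copied by hand into plain dictionaries; the property names and values below are hypothetical stand-ins, not the real CSRF filter properties.

```python
MISSING = "<not set>"


def diff_config(reference, suspect):
    """Return {key: (reference_value, suspect_value)} for every key that differs."""
    differences = {}
    for key in sorted(set(reference) | set(suspect)):
        ref_value = reference.get(key, MISSING)
        sus_value = suspect.get(key, MISSING)
        if ref_value != sus_value:
            differences[key] = (ref_value, sus_value)
    return differences


# Hypothetical property sets, typed in from each instance's configuration screen.
working = {"methods": ["POST", "PUT", "DELETE"], "safe.user.agents.enabled": True}
failing = {"methods": ["POST", "PUT", "DELETE"], "safe.user.agents.enabled": False}

for key, (good, bad) in diff_config(working, failing).items():
    print(f"{key}: working={good!r} failing={bad!r}")
```

Keys that exist on only one side show up with the `<not set>` marker, so a property that was added or removed is reported as well as one whose value changed.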
The question I have now, though, is how this could have been changed. Does anyone know how I can find out?
Hi @SvenDev,
You can refer to the article "How to track down unexpected OSGi configuration updates".
Hope this helps!