
SOLVED

How to get the path of the page being published?

AEM_Dev_Newbie
Level 2

I need to hit a servlet whenever the author publishes any content on AEM. I need to do it at the end of the publish workflow such that the servlet is hit only after all the approval processes are complete and AEM finally replicates the content to the publish instance.

I need to do it for all ways of publishing content in AEM. Some of the most common ways are shown below (using AEM 6.5)

AEM_Dev_Newbie_0-1620298001801.png

AEM_Dev_Newbie_1-1620298257674.png

We have tried adding our own replication agent under "Agents on author" as shown below:

AEM_Dev_Newbie_4-1620298584487.png       AEM_Dev_Newbie_5-1620298595280.png

We pointed the agent to our custom servlet, and the servlet is being hit, but there seems to be no way to determine the path of the page from which the author initiated the publish. We need the page path inside the servlet too.

Therefore the only way seems to be through jQuery: on clicking the publish button, an AJAX call is made to the servlet, and we are able to capture the current page path this way. This approach is incorrect, since the AJAX call is independent of the built-in AEM publish workflow and the servlet will be called even if the page publish is rejected by an approver in the workflow.

The correct way would be to call the servlet at the end of the default AEM replication/activation/publish workflow but we also need the path of the page which was published/activated which seems impossible. Is there any way to call our servlet during the default AEM activation flow?


6 Replies
markus_bulla_adobe
Employee

Hi @AEM_Dev_Newbie!

 

I see two approaches to your requirement: 

  • A custom replication agent [1] that points to your servlet
    The request to your servlet will contain the path of the activated element if you use the dispatcher flush agent (see the headers of the HTTP request, namely the "CQ-Handle" header), or the full page content and information (in a serialized format) if you use the regular publication agent. If you just need the path, the dispatcher flush agent seems the better fit here. The dispatcher flush agent is usually inactive on the author tier and only active on publish instances, but if you need it to run on the author instance, there is no issue with that. You can also write a custom replication agent.
    Please note: replication agents are not supported for AEM as a Cloud Service.
  • A workflow launcher that listens for modifications of the cq:lastReplicated property of your pages
    You can define a workflow launcher [2] that listens for changes to the content and starts a workflow whenever they occur. Your workflow could then either send the page (path) to your servlet or contain the logic that is currently implemented in your servlet. The launcher can be restricted to a certain path/content tree and to specific resource types, and it can listen only for changes to specific properties (the cq:lastReplicated date in this case).
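
To illustrate the first approach, here is a minimal sketch of how the receiving side might pull the activated path out of the request headers. It is plain Java with no AEM or servlet dependencies, so the headers are passed in as a simple map; the class and method names (ActivatedPathReader, activatedPath) are made up for this example, and only the "CQ-Handle" header name comes from the description above. In a real servlet you would read the same header from the request object instead.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class ActivatedPathReader {

    // Header mentioned above; set by the dispatcher flush agent.
    static final String HANDLE_HEADER = "CQ-Handle";

    /**
     * Returns the activated path from the request headers, if present.
     * Header names are compared case-insensitively, and only values that
     * look like an absolute repository path (starting with "/") are accepted.
     */
    public static Optional<String> activatedPath(Map<String, List<String>> headers) {
        return headers.entrySet().stream()
                .filter(e -> HANDLE_HEADER.equalsIgnoreCase(e.getKey()))
                .flatMap(e -> e.getValue().stream())
                .map(String::trim)
                .filter(v -> v.startsWith("/"))
                .findFirst();
    }

    public static void main(String[] args) {
        // Hypothetical headers, as they might arrive at the custom servlet.
        Map<String, List<String>> headers = Map.of(
                "cq-handle", List.of("/content/site/en/home"));
        System.out.println(activatedPath(headers).orElse("(no path)"));
        // prints "/content/site/en/home"
    }
}
```

Returning an Optional keeps the "header missing" case explicit, so the servlet can decide whether to ignore the request or log it.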

As you already mentioned, IMO it would not be appropriate to handle this on the front end (jQuery or similar).

 

Hope that helps!

 

[1] https://experienceleague.adobe.com/docs/experience-manager-65/deploying/configuring/replication.html

[2] https://experienceleague.adobe.com/docs/experience-manager-65/administering/operations/workflows-sta...

AEM_Dev_Newbie
Level 2
Thank you. We created a new replication agent and added a header with the path and were able to capture it on the backend. The problem is that now we are trying to get the path of a collection but it is providing only the path of the image being used in the collection.
Jörg_Hoh
Correct answer by
Employee

So what are you trying to achieve? As I understand it, you want to get the path of the page that is being published, is this correct? I ask because you mention at one point that you need to extract the page the author is currently on. And while that might be the same in some cases, it's not required at all.

 

Then the next question: At which point do you need your code to run? Before the replication is initiated? When the replication has completed? When the payload has reached all publish instances?


AEM_Dev_Newbie
Level 2
So in this project AEM is used as a headless CMS so basically all we need is the content from it. We are calling an endpoint which will index the page that the author just made changes on, but it should not index the page unless the changes are completely approved for publishing. We need to call our code at the end of the approval/publish workflow where we can be absolutely sure that this page can be indexed now since all the changes are approved.
Jörg_Hoh
Employee

Do you plan to fetch this data directly from author, and will the index job crawl this from publish?

I would implement it this way:

  1. You need an approval workflow, which also includes the replication of the approved content.
  2. As a last step in the workflow you add a notification to the external system, that it can crawl the content on author (or you package it and send it to the external system, your choice).

 

In that case the workflow payload is either the page on which the workflow was invoked or a workflow package (a set of pages). In the first case you can get the payload path directly, while in the second case you need to use the ResourceCollectionManager service to list the content of the workflow package. See https://github.com/Adobe-Consulting-Services/acs-aem-commons/blob/master/bundle/src/main/java/com/ad... for an example of how it can be done.
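
The logic of that last workflow step could be sketched roughly like this. It is plain Java without AEM dependencies, so it is not a drop-in workflow process: the packageMembers function is a hypothetical stand-in for the lookup you would do with ResourceCollectionManager, and the class name, the indexer URL and the "path" form parameter are assumptions made for the example. It only shows the two decisions involved: expand a package into its pages, then build one notification per page.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.function.Function;

public class PublishNotifier {

    /**
     * Expands a workflow payload into the page paths to notify about.
     * packageMembers is a hypothetical stand-in for the workflow package
     * lookup; it returns an empty list when the payload is a plain page.
     */
    public static List<String> resolvePaths(String payloadPath,
                                            Function<String, List<String>> packageMembers) {
        List<String> members = packageMembers.apply(payloadPath);
        return members.isEmpty() ? List.of(payloadPath) : List.copyOf(members);
    }

    /** Builds (but does not send) a POST telling the indexer which path to crawl. */
    public static HttpRequest buildRequest(URI indexerEndpoint, String pagePath) {
        // "path" is an assumed parameter name; use whatever your endpoint expects.
        String body = "path=" + URLEncoder.encode(pagePath, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(indexerEndpoint)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        // Made-up package content; a real lookup would query the repository.
        Function<String, List<String>> lookup = p ->
                p.contains("/packages/")
                        ? List.of("/content/site/en/a", "/content/site/en/b")
                        : List.of();
        URI endpoint = URI.create("https://indexer.example.com/notify"); // assumed URL
        for (String path : resolvePaths("/example/packages/release-1", lookup)) {
            HttpRequest request = buildRequest(endpoint, path);
            System.out.println(request.method() + " " + request.uri() + " for " + path);
        }
    }
}
```

Sending the request (for example with java.net.http.HttpClient) is left out on purpose, so the sketch can be run without a reachable endpoint.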