SOLVED

AEM 6.5.11 - Replication Path Transformer is not working

Level 2

We are using AEM 6.5.11. The architecture has 1 Author connected to 2 Publishers. Each Publisher is connected to its own dispatcher, and an application gateway load balancer sits in front to balance the load between the 2 dispatchers. There is no CDN in the ecosystem.

 

We are using Resource Resolver Factory settings for URL shortening to eliminate /content/mysite, /content/experience-fragments, and /content/dam/mysite from the URL paths on Publisher as well as on dispatcher.

 

The obvious side effect is that, on replication, changes do not show up on the end site because cache invalidation does not work with the shortened URLs. The dispatcher flush cannot find the long path in the cache, so the cached file is never invalidated and is not refreshed from the Publisher on the next request. Since we have no CDN to consider, I am not using a custom ContentBuilder. Instead, I am using a ReplicationPathTransformer to manipulate the path when it contains "mysite" (so that other projects sharing the same instance are not affected). I can see that the bundle is Active and the Replication Transformer class has been assigned a Service ID, so I assume it loaded correctly. However, when I replicate from the Author, nothing from the transformer is logged at DEBUG level in the project log on the Publisher (the log itself is being written to properly).

 

We are not using ACS Commons in our project at the moment, so I am not considering the ACS Commons Dispatcher Flush Rules.

 

 

 

package xyz.core.service.impl;

import javax.jcr.Session;
import org.apache.commons.lang3.StringUtils;
import org.apache.sling.models.annotations.injectorspecific.OSGiService;
import org.osgi.service.component.annotations.Component;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.day.cq.replication.Agent;
import com.day.cq.replication.ReplicationAction;
import com.day.cq.replication.ReplicationActionType;
import com.day.cq.replication.ReplicationPathTransformer;
import xyz.core.service.OsgiConfigurationService;
import xyz.core.utils.UrlUtil;

@Component(
    service = MySiteReplicationPathTransformer.class,
    name = "MySiteReplicationPathTransformer",
    immediate = true)
public class MySiteReplicationPathTransformer implements ReplicationPathTransformer {

    @OSGiService
    OsgiConfigurationService osgiService;

    public static final String MYSITE_PREFIX = "mysite";

    /** The Logger */
    private static final Logger LOGGER = LoggerFactory.getLogger(MySiteReplicationPathTransformer.class);

    private String stripPageUrl;
    private String stripFrgmntUrl;
    private String stripAssetUrl;

    @Override
    public String transform(Session session, String replicationPath,
            ReplicationAction replicationAction, Agent agent) {

        LOGGER.debug("MySite Replication Path Transformer has been triggered successfully.");

        if (StringUtils.isNotEmpty(replicationPath)
                && (StringUtils.contains(replicationPath, MYSITE_PREFIX))) {

            LOGGER.debug("Replication Path Transformer: Source Path: {}", replicationPath);

            if (null != osgiService) {
                stripPageUrl = osgiService.getProperty(UrlUtil.STRING_PID_PROP,
                        UrlUtil.STRING_STRIPASSET_PROP, StringUtils.EMPTY);
                stripFrgmntUrl = osgiService.getProperty(UrlUtil.STRING_PID_PROP,
                        UrlUtil.STRING_STRIPASSET_PROP, StringUtils.EMPTY);
                stripAssetUrl = osgiService.getProperty(UrlUtil.STRING_PID_PROP,
                        UrlUtil.STRING_STRIPASSET_PROP, StringUtils.EMPTY);
            }

            // Content Pages
            if (StringUtils.contains(replicationPath, stripPageUrl)) {
                replicationPath = replicationPath.replaceAll(stripPageUrl, StringUtils.EMPTY);

            // Experience Fragments
            } else if (StringUtils.contains(replicationPath, stripFrgmntUrl)) {
                replicationPath = replicationPath.replaceAll(stripFrgmntUrl, StringUtils.EMPTY);

            // Digital Assets
            } else if (StringUtils.contains(replicationPath, stripAssetUrl)) {
                replicationPath = replicationPath.replaceAll(stripAssetUrl, StringUtils.EMPTY);
            }

            LOGGER.debug("Replication Path Transformer: Processed Path: {}", replicationPath);
        }

        return replicationPath;
    }


    @Override
    public boolean accepts(Session session,
            ReplicationAction replicationAction, Agent agent) {

        /*
         * Check if the agent is a dispatcher agent
         * if it is the agent you are targeting, return true
         */
        if (replicationAction.getType().equals(ReplicationActionType.ACTIVATE)
                || replicationAction.getType().equals(ReplicationActionType.DELETE)) {
            return true;
        }
        return false;
    }

}
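
Two things in the class above may explain why transform() never logs anything, independent of the dispatcher setup. The component is registered with service = MySiteReplicationPathTransformer.class, so it is published under its own class name and a lookup for ReplicationPathTransformer services will not find it. It also uses @OSGiService, which is a Sling Models injector annotation; in a Declarative Services component the field is not injected and stays null, so the strip prefixes are never read. The skeleton below is a minimal sketch of the same class with only those two points changed. It assumes OsgiConfigurationService is itself registered as an OSGi service, and it compiles only against the AEM and OSGi APIs already imported in the original.

```java
package xyz.core.service.impl;

import javax.jcr.Session;
import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Reference;
import com.day.cq.replication.Agent;
import com.day.cq.replication.ReplicationAction;
import com.day.cq.replication.ReplicationActionType;
import com.day.cq.replication.ReplicationPathTransformer;
import xyz.core.service.OsgiConfigurationService;

// Registered under the interface, so lookups for
// ReplicationPathTransformer services can find this component.
@Component(service = ReplicationPathTransformer.class, immediate = true)
public class MySiteReplicationPathTransformer implements ReplicationPathTransformer {

    // Declarative Services injects dependencies through @Reference.
    // Assumption: OsgiConfigurationService is registered as an OSGi service.
    @Reference
    private OsgiConfigurationService osgiService;

    @Override
    public boolean accepts(Session session, ReplicationAction replicationAction, Agent agent) {
        // Same condition as the original: only ACTIVATE and DELETE are handled.
        ReplicationActionType type = replicationAction.getType();
        return type == ReplicationActionType.ACTIVATE || type == ReplicationActionType.DELETE;
    }

    @Override
    public String transform(Session session, String replicationPath,
            ReplicationAction replicationAction, Agent agent) {
        // The path rewriting from the original class would go here.
        return replicationPath;
    }
}
```

If the transformer still does not fire after this change, the Components view of the OSGi console on the instance that runs the flush agent should show whether the component is active and which service interface it provides.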

3 Replies

Level 4

I don't know if this applies to your situation with ReplicationPathTransformer, but have you tried sending cache invalidation requests from the publish instances to the dispatchers with cURL to see what happens? Try changing the CQ-Handle to match a page you want to invalidate.

Something like: 

curl -H "CQ-Handle: /content" -H "CQ-Path: /content" https://<web-server-dispatcher>/dispatcher/invalidate.cache
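
The same kind of request can be built from Java, which makes it easier to try several paths in a row. This is a rough sketch using the JDK's java.net.http client (Java 11+); the host name and content path are placeholders, and the class name is made up for illustration. It adds the CQ-Action header that Adobe's documented invalidation request includes, and uses POST, as a flush agent configured for POST would.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class DispatcherFlushRequest {

    // Builds (but does not send) an invalidation request for one content path.
    static HttpRequest build(String dispatcherBaseUrl, String contentPath) {
        return HttpRequest.newBuilder()
                .uri(URI.create(dispatcherBaseUrl + "/dispatcher/invalidate.cache"))
                .header("CQ-Action", "Activate")
                .header("CQ-Handle", contentPath)
                .header("CQ-Path", contentPath)
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) {
        // Placeholder host and path; replace with your own dispatcher and page.
        HttpRequest request = build("https://dispatcher.example.com",
                "/content/mysite/us/en/global/home");
        System.out.println(request.method() + " " + request.uri());
        request.headers().map().forEach((name, values) ->
                System.out.println(name + ": " + String.join(",", values)));
    }
}
```

To actually send it, pass the request to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) from a host that the dispatcher's allowedClients section permits, then compare the result for the long path and the shortened path.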

Are you using .stat files? Do they get created in your cache directory/docroot?

Have you tried using a new default dispatcher flush agent on publish (publish the flush agent from the author to the publish instances)? You can then open the agent on the publish instances to test it and check the log link in the agent. I needed to change GET to POST in my flush agent. Make sure all the required headers and triggers are specified in the flush agent config.

Maybe you could add some rewrites to your site's Apache config to deal with the real location of the cache:

https://www.netcentric.biz/insights/2017/01/aem-dispatcher-cache-invalidation-for-multiple-cache-far...

https://www.albinsblog.com/2016/06/dispatcher-cache-invalidation-for-multisite-configuration-AEM-CQ5...

like:

<LocationMatch "^/dispatcher/invalidate.cache$">
     # domain A
     SetEnvIf CQ-Path ".*/content-path-of-domain-A/.*" FLUSH_HOST=domain-A
     RequestHeader set Host %{FLUSH_HOST}e env=FLUSH_HOST
     # domain B
     SetEnvIf CQ-Path ".*/content-path-of-domain-B/.*" FLUSH_HOST=domain-B
     RequestHeader set Host %{FLUSH_HOST}e env=FLUSH_HOST
 </LocationMatch>

You might need a rewrite like:

# rewrite for root redirect
RewriteRule ^/?$ /content/${CONTENT_FOLDER_NAME}/us/en.html [PT,L]

https://experienceleaguecommunities.adobe.com/t5/adobe-experience-manager/aem-6-5-dispatcher-rewrite...

https://aemhub.blogspot.com/2020/12/basic-dispatcher-url-rewrites-rules.html

 

Thanks for your detailed reply.

 

Yes, I do have a .stat file in the docroot, since my stat file level is 0.

I tried modifying the dispatcher flush agent on the AEM Publisher to use the specific page path, with the .html extension, hard-coded for 1 page instead of the {path} parameter, and then it worked fine.

 

So there are 2 ways I can approach this:

1. Add a Sling mapping with an internal redirect and add the last rule you mentioned, which basically caches pages, experience fragments, and DAM assets under their original paths so that the default flush agent works as-is. My problem here is that clientlibs are not loading and return 404, although /etc.clientlibs is allowed in filters.any, /etc/map.stage.publish/https has a clientlibs mapping with match etc[.]clientlibs/(.+) and redirect /etc.clientlibs/$1, and the anonymous user has read access on /etc/map.stage.publish.

 

2. Somehow intercept and tweak the payload path on replication, then ACTIVATE the default dispatcher flush so that cache invalidation runs on the stripped path: /us/en/global/home rather than /content/mysite/us/en/global/home.
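
For the second approach, the path rewrite itself can be sketched in plain Java. The prefixes below are the three mentioned earlier in this thread; the class and method names are made up for illustration. Unlike String.replaceAll, which treats its first argument as a regular expression and replaces every match anywhere in the string, this version only strips a prefix at the start of the path and only on a path-segment boundary.

```java
import java.util.List;

final class ShortPathMapper {

    // Prefixes from the thread; in a real bundle these would come from OSGi config.
    private static final List<String> PREFIXES = List.of(
            "/content/experience-fragments",
            "/content/dam/mysite",
            "/content/mysite");

    // Returns the shortened path, or the input unchanged if no prefix applies.
    static String toShortPath(String path) {
        if (path == null) {
            return null;
        }
        for (String prefix : PREFIXES) {
            if (path.equals(prefix)) {
                return "/";
            }
            // Require a "/" after the prefix so /content/mysite2 is left alone.
            if (path.startsWith(prefix + "/")) {
                return path.substring(prefix.length());
            }
        }
        return path;
    }

    public static void main(String[] args) {
        System.out.println(toShortPath("/content/mysite/us/en/global/home"));
    }
}
```

Whether the dispatcher actually stores files under the short path depends on how the rewrites and the docroot are set up, so it is worth checking the cache directory before relying on this mapping.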

 

At the moment I am stuck with both approaches.

Correct answer by
Level 4

Re: "My problem here is clientlibs are not loading and giving 404 although..."

I wonder if you need further rewrites. This doesn't seem to relate exactly to your issue, but we have rewrites similar to:

RewriteCond %{REQUEST_URI} !^/(dam|site-name|content|common|etc|apps|dispatcher)
RewriteRule ^/(.*) /content/site-name/en/$1 [NE,PT]

<LocationMatch "^/(dispatcher|en|content|errors|dam|common|etc|system|site-name|sites-section|libs/granite/csrf/token.json)">
    SetHandler dispatcher-handler
    ModMimeUsePathInfo On
</LocationMatch>

<LocationMatch "^/$">
    SetHandler dispatcher-handler
    ModMimeUsePathInfo On
</LocationMatch>