In the past, the default dispatcher configuration was that any query parameter circumvents the dispatcher cache. To combat that, developers could configure particular query parameters to be ignored (things like utm_source, etc).
Recently, Adobe changed its dispatcher recommendations to ignore ALL parameters by default, only breaking cache explicitly for known query parameters. See https://github.com/adobe/aem-dispatcher-optimizer-tool/blob/main/docs/Rules.md#dot---the-dispatcher-...
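To make the two approaches concrete, here is a minimal Python sketch of how ignore rules might decide whether a request is cacheable. It is a toy model, not the dispatcher's implementation: the function names and rule lists are hypothetical, and it assumes that the last matching rule wins and that a request is cacheable only when every one of its parameters is ignored.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit, parse_qsl


def is_ignored(param, rules):
    """Return True if `param` is ignored for caching (assumption: last match wins)."""
    ignored = False  # with no matching rule, the parameter is not ignored
    for glob, rule_type in rules:
        if fnmatchcase(param, glob):
            ignored = (rule_type == "allow")
    return ignored


def is_cacheable(url, rules):
    """Toy rule: a request can use the cache only if all of its parameters are ignored."""
    names = [name for name, _ in parse_qsl(urlsplit(url).query, keep_blank_values=True)]
    return all(is_ignored(name, rules) for name in names)


# Older style: nothing is ignored except known marketing parameters.
OLD_STYLE = [("*", "deny"), ("utm_*", "allow")]
# Newer recommendation: everything is ignored except known functional parameters.
# The parameter name "q" is purely illustrative.
NEW_STYLE = [("*", "allow"), ("q", "deny")]

for url in ("/page.html?utm_source=mail", "/page.html?token=abc", "/page.html?q=shoes"):
    print(url, is_cacheable(url, OLD_STYLE), is_cacheable(url, NEW_STYLE))
```

In this model an unanticipated parameter such as `token` bypasses the cache under the older style but is served from cache under the newer one.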
I'm not sure I like this. Sure, it increases the default cache hit ratio, but that was something we already accomplished in the past by configuring query parameters to be ignored. My concerns with this change are:
1. It requires the dispatcher configs to be updated if a new query string is added to code, and that is not expected by most developers
2. It does not prevent DDOS attacks (which I've seen Adobe purport that it does) - it just reduces which query params can be used to attack
3. It opens the door (or at least seems to) to very serious security concerns
As a quick example for bullet #3, consider a service that confirms a user signup with an emailed link that includes a UUID pointing to the registration. For the first person to click the link, everything works fine. The second person to click the link ends up not only failing to complete registration, but also seeing whatever the cached result from user 1 may have been (which could include account information).
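The scenario can be sketched with a toy cache in Python. This is an illustration under stated assumptions rather than real dispatcher or AEM code: `ToyDispatcher`, `confirm_signup`, the `/confirm.html` path, and the `token` parameter are all hypothetical, and the cache is assumed to ignore every query parameter while still forwarding the full URL on a miss.

```python
from urllib.parse import urlsplit, parse_qs


class ToyDispatcher:
    """Caches by path only, i.e. every query parameter is ignored for caching."""

    def __init__(self, backend):
        self.backend = backend
        self.cache = {}

    def get(self, url):
        path = urlsplit(url).path
        if path not in self.cache:
            # Cache miss: the full URL, query string included, reaches the backend.
            self.cache[path] = self.backend(url)
        return self.cache[path]


def confirm_signup(url):
    """Hypothetical publish-side handler that reads a `token` parameter."""
    token = parse_qs(urlsplit(url).query).get("token", ["<none>"])[0]
    return f"Registration confirmed for token {token}"


dispatcher = ToyDispatcher(confirm_signup)
first = dispatcher.get("/confirm.html?token=uuid-user-1")
second = dispatcher.get("/confirm.html?token=uuid-user-2")
print(first)
print(second)  # same body as `first`: user 2 is shown user 1's page
```

The second request never reaches the backend, so user 2's registration is not confirmed and the response they receive was rendered for user 1.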
For bullet #1, yes devs can be trained and it's not that big a deal to update the dispatcher since the code sits in the same codebase, but I really am wary of any setup where security (bullet #3) is breached by default unless the developer does actually remember.
I think I'd be less concerned with this change if the dispatcher didn't pass through any ignored query parameters to the publish server. That way any functionality based on a query parameter would fail for all users, including the first user, making issues easier to catch in Staging.
Curious what others' thoughts are. Am I missing something?
Hi Brett,
Thanks for sharing these concerns - appreciate your perspective on this. We anticipated questions about this rule and created the first video in our DOT "rule explainer" series to go over our motivation for the ignoreUrlParams rule: https://youtu.be/jgs_0EitZGg
@BrettBirschbach wrote: Sure, it increases the default cache hit ratio, but that was something we already accomplished in the past by configuring query parameters to be ignored
In our experience we've found that unexpected/unforeseen query params often end up included on links to the public facing site. This can happen automatically with ad campaigns, or when links to the site are sent via an email system. If these links reach a large enough audience they can lead to a slower experience for the users of the links, and at worst (given enough volume) can consume resources on the publish tier.
@BrettBirschbach wrote: It requires the dispatcher configs to be updated if a new query string is added to code, and that is not expected by most developers
If developers are not testing their code and components via a local dispatcher, and end users of the site can only access content via a dispatcher, are the developers really testing the site as it would be used in production? We know that the dispatcher configuration is often an afterthought, and understand that it adds an additional layer to a developer's setup, but we do think that it's an important layer which is worthy of consideration throughout the development of the site. To help with this, we've written instructions as part of our lab-format AEM Dispatcher Experiments repo to help devs get a local dispatcher set up on both macOS and Windows:
- macOS: https://github.com/adobe/aem-dispatcher-experiments/blob/main/docs/Local-Dispatcher-macOS.md
- Windows: https://github.com/adobe/aem-dispatcher-experiments/blob/main/docs/Local-Dispatcher-Windows.md
@BrettBirschbach wrote: It does not prevent DDOS attacks (which I've seen Adobe purport that it does) - it just reduces which query params can be used to attack
Agree 100% that it does not prevent DDOS. In our experience, it really just helps limit a form of self-inflicted DDOS, where often legitimate links to the site (as described above) can contain query params that count as cache misses.
@BrettBirschbach wrote: It opens the door (or at least seems to) to very serious security concerns [...] The first person to click the link works fine. The second person to click the link ends up not only failing to complete registration, but also seeing whatever the cached result from user 1 may have been (which could include account information)
For content like this which should never be cached no matter what, the "Dispatcher: no-cache" header can be set on the response.
Thanks @Bruce_Lefebvre for your response. I understand everything you're saying, and it sounds good on paper, but I feel the realities of AEM development make this a little less perfect than it may sound.
@Bruce_Lefebvre wrote: If developers are not testing their code and components via a local dispatcher, and end users of the site can only access content via a dispatcher, are the developers really testing the site as it would be used in production?
Yep, you're right, and I can't argue your "correctness". But the reality of AEM development is that it's far more onerous for local developers to run a full author, publish, dispatcher stack locally, keep content in sync across them, and deploy code to all environments for all changes as they work. Pair that with the fact that 99% of everything that works in testing on a simple author server works in production, and that there is QA/UAT on staging servers to catch the other 1%, and there's just no convincing developers to run a full stack. I've seen multiple vagrant/docker setups evangelized to help developers run a full stack, but developers continue to double click a jar :). Sorry, I venture to say that 90% of AEM developers do not and will not run a full stack locally, due to productivity considerations.
@Bruce_Lefebvre wrote: For content like this which should never be cached no matter what, the "Dispatcher: no-cache" header can be set on the response.
This doesn't change my major concern...that the system is NON-secure by default, and requires smart developers (or smart code reviewers) to be sure to add these no-cache headers whenever they are needed. Security-conscious, expert software devs may remember this 90% of the time (still not perfect) and junior/mid-levels likely won't even think about it, regardless of how much security training they've received.
The best systems are secure by default. Adobe said it itself when HTL (with auto-escaping) was created to replace JSPs.
What I feel is happening here is that Adobe is accepting security risks (which would get blamed on the implementer) in an effort to improve perceptions around the AEM stack's performance/cacheability (which would get blamed on Adobe).
As I said before, I'd be less concerned with this change if the dispatcher didn't pass through any ignored query parameters to the publish server. It wouldn't fix the fact that developers will nearly always forget to update the dispatcher to allow their new query parameters, but it would at least prevent the security concern.
@BrettBirschbach wrote: This doesn't change my major concern...that the system is NON-secure by default [..] In the past, the default dispatcher configuration was that any query parameter circumvents the dispatcher cache.
Just to clarify: the default configuration — included when a project is bootstrapped with the AEM archetype — has not changed (ignoreUrlParams remains commented out). We acknowledge that the creation and maintenance of the allow list rules (ignoreUrlParams, Sling selectors, and Sling suffixes) is going to take additional effort, likely beyond what dev teams are doing today. These rules, as the name of the tool suggests, are intended to optimize your use of the dispatcher.
As we were building the tool we also acknowledged that not all of these rules are going to work for everyone, so we created an extension mechanism that can be used to fine tune the rule set (or replace it altogether) depending on your specific needs. To show how this works, I’ve added a branch to a sample project which disables the ignoreUrlParams rule via the rule extension mechanism: https://github.com/blefebvre/aem-dot-example-project/compare/disable-ignore-url-params-rule
This extension mechanism, along with other features of the DOT, is covered in a lab-format exercise we put together to get folks comfortable with the tool: https://github.com/adobe/aem-dispatcher-experiments/tree/main/experiments/optimizer
@BrettBirschbach wrote: But the reality of AEM development is that it's far more onerous for local developers to run a full author, publish, dispatcher stack locally, keep content in sync across them, and deploy code to all environments for all changes as they work. [..] there's just no convincing developers to run a full stack.
We’re open to ideas on how we can make this easier. Currently, the archetype supports the `mvn install -PautoInstallSinglePackage -PautoInstallSinglePackagePublish` flags which will deploy packages to both author and publish instances in a single command. Would an official Docker container facilitate running a local dispatcher? Are there other ways we can make the use of the dispatcher easier and more commonplace for day-to-day development?
@BrettBirschbach wrote: consider a service that confirms a user signup with an emailed link that includes a UUID pointing to the registration
An additional thought on your example: if this service were implemented as a servlet (such as this Create Servlet example, although using "sling.servlet.methods=get" instead), it's worth noting that responses to requests without extensions will still not be cached by the dispatcher - regardless of the configuration of the ignoreUrlParams rule.
@Bruce_Lefebvre wrote: As we were building the tool we also acknowledged that not all of these rules are going to work for everyone, so we created an extension mechanism that can be used to fine tune the rule set (or replace it altogether) depending on your specific needs.
...
This extension mechanism, along with other features of the DOT, is covered in a lab-format exercise we put together to get folks comfortable with the tool: https://github.com/adobe/aem-dispatcher-experiments/tree/main/experiments/optimizer
That's awesome that there are some built-in ways to override the default validation rules! I think that definitely allows us to have our teams act in a way that we feel is more secure by default. That said, I still have a concern that many teams will follow the default rules and open a potential security hole for which they will ultimately be blamed.
@Bruce_Lefebvre wrote: We’re open to ideas on how we can make this easier. Currently, the archetype supports the `mvn install -PautoInstallSinglePackage -PautoInstallSinglePackagePublish` flags which will deploy packages to both author and publish instances in a single command. Would an official Docker container facilitate running a local dispatcher? Are there other ways we can make the use of the dispatcher easier and more commonplace for day-to-day development?
We've developed Vagrant and Docker setups internally, but it's clear to me that it's just not something the developer community wants. Even if a full build can push to both servers at the same time, most devs are using quicker methods to deploy code (particularly FE code) without maven builds. And even if mechanisms are available to sync FE code to both AEM servers simultaneously, and the dispatcher configs as well, there's still a reality of dealing with 3 servers with 3 sync processes (i.e. 3 points of failure) that will inevitably experience random issues. That's not to mention the need to keep content in sync across author/pub as well. Again, I get that "by the book" developers should be using a full 3-server setup, but my experience in the field is that double clicking an author jar is just far, far too tempting for all but the most devout.
@Bruce_Lefebvre wrote: An additional thought on your example: if this service were implemented as a servlet (such as this Create Servlet example, although using "sling.servlet.methods=get" instead), it's worth noting that responses to requests without extensions will still not be cached by the dispatcher - regardless of the configuration of the ignoreUrlParams rule.
Though this is a valid implementation recommendation, it doesn't address my concern with "secure by default" and in that way is very similar to the recommendation to add a "no-cache" header to the service.
It seems we may have to "agree to disagree" on some of these topics. That said, what are your thoughts on why the dispatcher forwards query parameters ignored for caching purposes to the AEM server? Given that it only forwards the parameter on to the AEM server in the case of a non-cached request, the system is indeterminate (from the end user's perspective) as to whether a query parameter will or won't be used on a given request. If URL parameters ignored for caching purposes were also stripped from the AEM request, such that they NEVER get to AEM, then I can see merit in this new default rule for /ignoreUrlParams.
@BrettBirschbach wrote: That said, what are your thoughts on why the dispatcher forwards query parameters ignored for caching purposes to the AEM server?
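The difference between forwarding and stripping ignored parameters can be sketched in Python as follows. This is a toy model under stated assumptions, not a description of how the dispatcher behaves or could be configured: `forward_url`, `serve`, the `strip_ignored` switch, and the `token` parameter are hypothetical names introduced for illustration.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode


def forward_url(url, ignored, strip_ignored):
    """URL the toy dispatcher sends to the backend on a cache miss."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    if strip_ignored:
        # Drop parameters that are ignored for caching before forwarding.
        params = [(k, v) for k, v in params if k not in ignored]
    query = urlencode(params)
    return parts.path + ("?" + query if query else "")


def serve(urls, ignored, strip_ignored):
    """Serve requests in order and report what the backend saw for each response."""
    cache = {}
    bodies = []
    for url in urls:
        parts = urlsplit(url)
        names = [k for k, _ in parse_qsl(parts.query, keep_blank_values=True)]
        cacheable = all(name in ignored for name in names)
        if cacheable and parts.path in cache:
            bodies.append(cache[parts.path])
            continue
        body = "backend saw " + forward_url(url, ignored, strip_ignored)
        if cacheable:
            cache[parts.path] = body
        bodies.append(body)
    return bodies


requests = ["/confirm.html?token=a", "/confirm.html?token=b"]
forwarding = serve(requests, {"token"}, strip_ignored=False)
stripping = serve(requests, {"token"}, strip_ignored=True)
print(forwarding)
print(stripping)
```

With forwarding, the first response depends on `token=a` while the second silently reuses it. With stripping, the backend never sees `token`, so the feature fails the same way for every request and is more likely to be noticed in testing.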
Good question. I've raised this internally and will share an update when I know more.
Thanks again for sharing your concerns about this rule. It's led to a good discussion on our team, and we've updated the ignoreUrlParams rule description with a "Security Caveat" section.