Implementing Structured Data with Googlebot IP Verification on AEM as a Cloud Service | Community
May 23, 2025
Solved

Implementing Structured Data with Googlebot IP Verification on AEM as a Cloud Service


Hi everyone,

I’m working on implementing structured data markup for paywalled content on AEM as a Cloud Service, and I’d like to add a verification mechanism to ensure that requests claiming to be from Googlebot are legitimate, i.e., by verifying the client IP through a reverse DNS lookup as described in Google's documentation: https://developers.google.com/search/docs/crawling-indexing/verifying-googlebot?hl=en

Before going further, I wanted to ask:

  1. Has anyone implemented IP verification for Googlebot within AEM as a Cloud Service?

  2. Is it necessary to configure this in collaboration with Dispatcher rules, or can it be handled entirely within AEM?

  3. If both systems (Dispatcher + AEM) are involved, what’s the best practice to ensure smooth communication between them? For example, how would you share or pass verification results between the dispatcher layer and AEM?

Any examples, guidance, or lessons learned would be greatly appreciated!

Thanks in advance,
Adriana

Best answer by Pallavi_Shukla_

There’s no out-of-the-box AEM module for Googlebot verification, but it can be custom-implemented using a Sling Filter or Servlet Filter in AEM.

The solution works as follows:

1. Intercept Targeted Requests
The filter applies to paths such as:

  • `/content/...`
  • Structured data endpoints (e.g., `/page.structure-data.json`)
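
As a rough illustration of step 1, the path check could be a small helper like the one below. This is only a sketch: the class name `TargetedPaths` is hypothetical, and the prefix and suffix values are simply the two examples listed above. In a real Sling filter the same restriction might instead be declared on the filter registration.

```java
import java.util.List;

/** Hypothetical helper: decides whether a request path should go through verification. */
final class TargetedPaths {
    // Assumed values, copied from the examples in the answer; adjust to your site.
    private static final String CONTENT_PREFIX = "/content/";
    private static final List<String> ENDPOINT_SUFFIXES = List.of(".structure-data.json");

    private TargetedPaths() {}

    static boolean isTargeted(String path) {
        if (path == null) {
            return false;
        }
        if (path.startsWith(CONTENT_PREFIX)) {
            return true;
        }
        // Structured data endpoints are matched by their selector/extension suffix.
        for (String suffix : ENDPOINT_SUFFIXES) {
            if (path.endsWith(suffix)) {
                return true;
            }
        }
        return false;
    }
}
```

Requests for which `isTargeted` returns `false` would skip the DNS checks entirely, which keeps the lookups off unrelated traffic.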

2. Perform Reverse DNS Lookup

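  • Extract the client IP from `request.getRemoteAddr()`. Behind the CDN and Dispatcher this can be a proxy address rather than the original client, so confirm which address or forwarding header (for example `X-Forwarded-For`) carries the real client IP in your setup.
  • Do a reverse DNS lookup to get the hostname
  • Check that the hostname ends with `.googlebot.com` or `.google.com`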
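
The hostname check in step 2 is easy to get subtly wrong (for example by matching `evilgoogle.com`), so here is a minimal sketch of it. The class name `GooglebotHostnames` is hypothetical; the two suffixes are the ones listed above, and keeping the leading dot is what prevents look-alike domains from matching.

```java
import java.util.Locale;

/** Hypothetical helper for the suffix check on a reverse-resolved hostname. */
final class GooglebotHostnames {
    // Suffixes from the answer above; the leading dot is deliberate.
    private static final String[] ALLOWED_SUFFIXES = {".googlebot.com", ".google.com"};

    private GooglebotHostnames() {}

    static boolean hasGoogleSuffix(String hostname) {
        if (hostname == null) {
            return false;
        }
        String normalized = hostname.toLowerCase(Locale.ROOT);
        // Some resolvers return a fully qualified name with a trailing dot.
        if (normalized.endsWith(".")) {
            normalized = normalized.substring(0, normalized.length() - 1);
        }
        for (String suffix : ALLOWED_SUFFIXES) {
            if (normalized.endsWith(suffix)) {
                return true;
            }
        }
        return false;
    }
}
```

The hostname passed in would typically come from something like `InetAddress.getByName(ip).getCanonicalHostName()`, which returns the textual IP when no reverse record exists, so a failed lookup simply fails this check.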

3. Perform Forward DNS Lookup

  • Resolve the hostname obtained above
  • Verify that the original IP is one of the resolved addresses
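
Putting the reverse and forward lookups together might look roughly like this. It is a sketch under assumptions, not a tested AEM implementation: `ForwardConfirmedLookup` and its `Dns` interface are hypothetical names, introduced so the DNS calls can be swapped out in tests, and the `SYSTEM` instance uses the standard `java.net.InetAddress` resolver.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical sketch of a forward-confirmed reverse DNS check. */
final class ForwardConfirmedLookup {

    /** Seam for DNS access so lookups can be replaced in tests. */
    interface Dns {
        String reverse(InetAddress address);
        InetAddress[] forward(String hostname) throws UnknownHostException;
    }

    /** Default resolver backed by the JDK. */
    static final Dns SYSTEM = new Dns() {
        @Override
        public String reverse(InetAddress address) {
            // Returns the textual IP if no reverse record is found.
            return address.getCanonicalHostName();
        }

        @Override
        public InetAddress[] forward(String hostname) throws UnknownHostException {
            return InetAddress.getAllByName(hostname);
        }
    };

    private ForwardConfirmedLookup() {}

    static boolean isVerifiedGooglebot(String ipLiteral, Dns dns) {
        try {
            InetAddress original = InetAddress.getByName(ipLiteral);
            // Step 2: reverse lookup and suffix check.
            String host = dns.reverse(original);
            if (host == null) {
                return false;
            }
            String lower = host.toLowerCase(Locale.ROOT);
            if (!lower.endsWith(".googlebot.com") && !lower.endsWith(".google.com")) {
                return false;
            }
            // Step 3: forward lookup must include the original address.
            return Arrays.asList(dns.forward(host)).contains(original);
        } catch (UnknownHostException e) {
            return false;
        }
    }
}
```

A call such as `ForwardConfirmedLookup.isVerifiedGooglebot(clientIp, ForwardConfirmedLookup.SYSTEM)` performs blocking DNS lookups, so in practice it would probably make sense to cache results per IP rather than resolve on every request.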

4. Flag the Request

If both checks pass, mark the request with a flag: `request.setAttribute("isVerified", true)`. Otherwise, set it to `false`.

This flag can then be used in downstream logic, such as structured data components, to control what is exposed to verified bots versus regular users.
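
On the consuming side, a component would read the attribute back as an `Object`, which may be missing if the filter did not run for that request. A small, hypothetical helper (the name `VerifiedFlag` and the `selectBody` method are illustrative only) shows one null-safe way to do that, assuming the attribute name `isVerified` from step 4:

```java
/** Hypothetical helper for reading the verification flag set by the filter. */
final class VerifiedFlag {
    // Attribute name taken from the answer above.
    static final String ATTRIBUTE = "isVerified";

    private VerifiedFlag() {}

    /** Treats a missing or non-Boolean attribute as "not verified". */
    static boolean isVerified(Object attributeValue) {
        return Boolean.TRUE.equals(attributeValue);
    }

    /** Picks the content variant to render based on the flag. */
    static String selectBody(Object attributeValue, String fullContent, String teaser) {
        return isVerified(attributeValue) ? fullContent : teaser;
    }
}
```

In a component this would be called as `VerifiedFlag.isVerified(request.getAttribute(VerifiedFlag.ATTRIBUTE))`. Because the output then differs per request, responses that depend on the flag also need care with Dispatcher and CDN caching.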

1 reply

Pallavi_Shukla_
Community Advisor
June 17, 2025
