Currently, Cloud Manager and Splunk dashboards do not provide visibility into leader pod restarts in AEM as a Cloud Service environments. Identifying when and why a leader pod restarts is essential for understanding system stability and troubleshooting potential performance or replication issues.
Having this visibility directly in Cloud Manager (or accessible via Splunk logs) would help operations and development teams proactively monitor environment health, correlate incidents, and improve root-cause analysis without needing to involve Adobe Support.
Use-case:
Leader pod restarts can affect replication, workflows, and cache invalidation, but customers have no easy way to detect or audit these events. Providing this visibility would reduce investigation time during incidents, improve transparency, and help teams keep AEM Cloud environments stable.
Current/Experienced Behavior:
There is no visibility or alert mechanism in Cloud Manager or Splunk for leader pod restarts. Customers must rely on Adobe Support to confirm such events.
Improved/Expected Behavior:
Display leader pod restart events in Cloud Manager’s monitoring section, with timestamps and pod names.
Alternatively, include this information in the Splunk logs or provide an API endpoint to query pod restart history.
Optionally, provide an alerting mechanism (email or webhook) that fires when a leader pod restarts unexpectedly.
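To make the request more concrete, here is a rough sketch of what a restart event could contain and how a team might pick out the unexpected ones for alerting. Everything in it is an assumption for illustration only: the `LeaderRestartEvent` type, its field names, the `reason` values, and the pod names are hypothetical and do not correspond to any existing Cloud Manager or Splunk API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


# Hypothetical event shape: these fields are NOT part of any existing Adobe API.
@dataclass(frozen=True)
class LeaderRestartEvent:
    pod_name: str           # name of the leader pod that restarted
    restarted_at: datetime  # timestamp of the restart (UTC)
    reason: str             # illustrative values such as "OOMKilled" or "Deployment"
    planned: bool           # True if the restart was part of a deployment or maintenance


def unexpected_restarts(events):
    """Return the unplanned restarts, oldest first, as candidates for an alert."""
    return sorted(
        (e for e in events if not e.planned),
        key=lambda e: e.restarted_at,
    )


# Made-up sample data standing in for a restart-history query result.
events = [
    LeaderRestartEvent("author-leader-b", datetime(2025, 10, 2, 14, 5, tzinfo=timezone.utc), "OOMKilled", False),
    LeaderRestartEvent("author-leader-a", datetime(2025, 10, 1, 9, 30, tzinfo=timezone.utc), "Deployment", True),
    LeaderRestartEvent("author-leader-c", datetime(2025, 10, 1, 23, 45, tzinfo=timezone.utc), "LivenessProbeFailed", False),
]

for event in unexpected_restarts(events):
    print(f"{event.restarted_at.isoformat()} {event.pod_name} {event.reason}")
```

The timestamp and pod name are the minimum asked for above. The `reason` and `planned` fields are an additional suggestion, since they would let customers separate deployment-related restarts from ones that need investigation.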
Environment Details (AEM version/service pack, any other specifics if applicable):
AEM as a Cloud Service – SDK version 2025.9.22758 (build: 20250928T092442Z)