AEM Cloud Pipeline Deployment Error | Community
Rohan_Garg
Community Advisor
November 28, 2024
Solved


Hi Everyone,

 

I am getting an odd error when deploying a branch through my Prod pipeline. The same issue does not come up when I deploy it through my Non-Prod pipeline.

The Prod pipeline fails at the "Deploy to Stage" step while "Installing Mutable Content".

I downloaded the build logs to look for a specific error but found nothing definitive:

2024-11-28T06:35:19+0000 Install mutable content job has started.
2024-11-28T06:47:12+0000 Install mutable content job has failed.
2024-11-28T06:47:13+0000 Failed deployment

 

===== source=installMutableContent.log =====

==================== container=main, pod=cm-pyyyyy-eyyyyyy-aem-mutable-content-wrapper-5050059-vcrkc ====================

2024-11-28T06:35:24+00:00 [SkylineProxyJob] Waiting until job is started
2024-11-28T06:35:24+00:00 [SkylineProxyJob] Waiting 1800s for job state Progressing|Failed|Complete.
2024-11-28T06:35:25+00:00 [SkylineProxyJob] Unexpected state: .
2024-11-28T06:35:25+00:00 [SkylineProxyJob] The wrapper job failed. The skylineenvironment/cm-pyyyyy-eyyyyyy status is:

{
  "conditions": [
    {
      "lastTransitionTime": "2024-11-28T05:56:59Z",
      "message": "All deployments are up",
      "status": "True",
      "type": "DeploymentsCompleted"
    },
    ...

 

From the logs, I could infer the following:

Wrapper Job Failed:

  • The SkylineProxyJob encountered an unexpected state and is marked as failed. It was waiting for the job to reach one of the following states: Progressing, Failed, or Complete, but encountered an issue.

As far as I know, SkylineProxy serves as an intermediary service that manages and tracks the execution of jobs in the cloud environment.

What confuses me is why this fails on the production pipeline but not on the non-production pipeline.
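
The inference above can be checked mechanically when the downloaded log is long. Below is a minimal Python sketch that pulls out every "Unexpected state" line and shows whether a state was reported at all. It assumes the log lines keep exactly the format shown in the excerpt; the function name find_unexpected_states is hypothetical and not part of any Adobe tooling.

```python
import re

# Matches lines such as:
# 2024-11-28T06:35:25+00:00 [SkylineProxyJob] Unexpected state: .
# (assumption: the format is exactly as in the log excerpt above)
STATE_LINE = re.compile(
    r"^(?P<ts>\S+) \[SkylineProxyJob\] Unexpected state: (?P<state>.*?)\.\s*$"
)


def find_unexpected_states(log_text):
    """Return (timestamp, state) pairs for every 'Unexpected state' line.

    An empty state string means the line named no state at all.
    """
    hits = []
    for line in log_text.splitlines():
        match = STATE_LINE.match(line.strip())
        if match:
            hits.append((match.group("ts"), match.group("state").strip()))
    return hits


# Lines copied from the installMutableContent.log excerpt above.
SAMPLE_LOG = """\
2024-11-28T06:35:24+00:00 [SkylineProxyJob] Waiting until job is started
2024-11-28T06:35:24+00:00 [SkylineProxyJob] Waiting 1800s for job state Progressing|Failed|Complete.
2024-11-28T06:35:25+00:00 [SkylineProxyJob] Unexpected state: .
2024-11-28T06:35:25+00:00 [SkylineProxyJob] The wrapper job failed. The skylineenvironment/cm-pyyyyy-eyyyyyy status is:
"""

for ts, state in find_unexpected_states(SAMPLE_LOG):
    label = state if state else "<empty>"
    print(f"{ts}: unexpected job state {label}")
```

For the excerpt above, the state comes back empty, which suggests the job never reported Progressing, Failed, or Complete within that second, rather than reporting an explicit failure.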

 

We have raised a ticket with Adobe, but would appreciate any leads or pointers on this.

@arunpatidar @estebanbustamante @aanchal-sikka @tethich @daniel-strmecki

 

Thanks,

Rohan Garg

 

5 replies

arunpatidar
Community Advisor
November 28, 2024

Hi @rohan_garg 

Can you check the error logs on the AEM instances? It could be that the AEM packages are not being installed properly. This is just a guess.

Meanwhile, raise a ticket with Adobe as well.

Arun Patidar
Rohan_Garg
Community Advisor
November 29, 2024

Hey @arunpatidar, it is not due to the packages not getting installed properly.

Adobe is investigating the deployment logs. Thanks for the suggestion!

narendiran_ravi
Level 6
November 28, 2024

Sometimes, if the distribution queue is blocked for a long time, you might see this issue. Please check whether that is the case.

Rohan_Garg
Community Advisor
November 29, 2024

Hey @narendiran_ravi, the distribution queue is not blocked. We validated it: the queue is idle, the test connection succeeded, and we were able to push a content update successfully!

Thanks for the suggestion!

aanchal-sikka
Community Advisor
November 28, 2024

@rohan_garg 

 

Often these are environment-specific issues. For example, a parent node might not be of the expected type.

If the logs are not displaying details, you can try installing the mutable content package manually. Package Manager should be able to show the issue.

Adobe should have access to more details on the deployment failure, so you can also wait for their response.

Aanchal Sikka
Rohan_Garg
Community Advisor
November 29, 2024

Hey @aanchal-sikka, yes, it is an environment-specific issue.

I will try installing the mutable content manually, but it is odd that this issue occurs only on the Prod pipeline!

Thanks for the suggestion!

A_H_M_Imrul
Community Advisor
November 28, 2024

Hello @rohan_garg,

Please follow the section "Including /var in content package" in the guide on debugging Cloud Manager build and deployment, especially its Resolution part: https://experienceleague.adobe.com/en/docs/experience-manager-learn/cloud-service/debugging/debugging-aem-as-a-cloud-service/build-and-deployment#including-var-in-content-package

Thanks

 

Rohan_Garg
Community Advisor
November 29, 2024

Hey @a_h_m_imrul,
We do have the following filter rule in ui.content's filter.xml:

<filter root="/var/workflow/models" mode="merge"/>

But the node structures are defined using repoinit, so this shouldn't be a problem.

Also, this problem should have occurred on the non-prod pipeline too, since it is deployed daily while the production pipeline is deployed bi-weekly.
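
To double-check which filter roots in a package touch /var, and with which import mode, the filter file can be scanned with a small script. Below is a minimal Python sketch, assuming a standard FileVault filter.xml layout. The function name var_filters and the /content/hypothetical-site root are made up for illustration; only the /var/workflow/models rule comes from our project.

```python
import xml.etree.ElementTree as ET


def var_filters(filter_xml):
    """Return (root, mode) for each <filter> whose root is /var or below it.

    A missing mode attribute is reported as "replace", the FileVault default.
    """
    tree = ET.fromstring(filter_xml)
    found = []
    for node in tree.iter("filter"):
        root = node.get("root", "")
        if root == "/var" or root.startswith("/var/"):
            found.append((root, node.get("mode", "replace")))
    return found


# Illustrative filter.xml content; the /content root is hypothetical.
SAMPLE_FILTER = """
<workspaceFilter version="1.0">
    <filter root="/content/hypothetical-site"/>
    <filter root="/var/workflow/models" mode="merge"/>
</workspaceFilter>
"""

for root, mode in var_filters(SAMPLE_FILTER):
    print(f"{root} (mode={mode})")
```

Any root this reports is a candidate for the /var-related failure described in the linked section, so it is worth comparing the result against what the repoinit scripts create.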

Thanks for the suggestion!

A_H_M_Imrul
Community Advisor
November 29, 2024

@rohan_garg, what you are saying makes sense, but please make sure you did not miss this article: https://helpx.adobe.com/in/experience-manager/kb/cm/cloudmanager-deploy-fails-due-to-sling-distribution-aem.html

It covers the service user sling-distribution-importer and granting it the mentioned permission through a repoinit script.

It's the 4th point from the resolutions I shared previously.

Please let me know your findings.

Thanks  

Rohan_Garg
Community Advisor
Author, Accepted solution
December 6, 2024

Hi Everyone,

 

Adobe did an RCA on this and provided the following reason:

The program's environment was hibernated when the deployment was triggered.

Ideally, a deployment should de-hibernate the environment, but apparently that is not the case currently.

Hibernation only occurs in sandbox environments, so the issue cannot be reproduced in a non-sandbox environment.

Thanks for the support!

 

Regards,

Rohan Garg