Expand my Community achievements bar.

SOLVED

AEM Cloud Pipeline Deployment Error

Avatar

Community Advisor

Hi Everyone,

 

I am getting a weird error when deploying a branch to my Prod Pipeline. The same issue does not come up when I deploy it to my Non-Prod pipeline.

The Prod Pipeline fails at the "Deploy to Stage" when it is "Installing Mutable Content" as seen below - 

Rohan_Garg_0-1732777357166.png

I downloaded the build logs to look for any particular error but was unable to find anything definitive - 

2024-11-28T06:35:19+0000 Install mutable content job has started.
2024-11-28T06:47:12+0000 Install mutable content job has failed.
2024-11-28T06:47:13+0000 Failed deployment

 

===== source=installMutableContent.log =====

==================== container=main, pod=cm-pyyyyy-eyyyyyy-aem-mutable-content-wrapper-5050059-vcrkc ====================

2024-11-28T06:35:24+00:00 [SkylineProxyJob] Waiting until job is started
2024-11-28T06:35:24+00:00 [SkylineProxyJob] Waiting 1800s for job state Progressing|Failed|Complete.
2024-11-28T06:35:25+00:00 [SkylineProxyJob] Unexpected state: .
2024-11-28T06:35:25+00:00 [SkylineProxyJob] The wrapper job failed. The skylineenvironment/cm-pyyyyy-eyyyyyy status is:

{
"conditions": [
{
"lastTransitionTime": "2024-11-28T05:56:59Z",
"message": "All deployments are up",
"status": "True",
"type": "DeploymentsCompleted"
},
...

 

From the logs I could infer the below - 

Wrapper Job Failed:

  • The SkylineProxyJob encountered an unexpected state and is marked as failed. It was waiting for the job to reach one of the following states: Progressing, Failed, or Complete, but encountered an issue.

SkylineProxy I know serves as an intermediary service to manage and track execution of jobs in the cloud environment.

But why it would not fail on a non-production pipeline and fail on production pipeline is confusing.

 

We have raised an Adobe ticket but would appreciate if anyone has any leads or pointers on this.

@arunpatidar@EstebanBustamante@aanchal-sikka@Tethich@daniel-strmecki 

 

Thanks,

Rohan Garg

 

1 Accepted Solution

Avatar

Correct answer by
Community Advisor

Hi Everyone,

 

Adobe did an RCA on this and provided the below reason-

The program's environment was under hibernation when the deployment was triggered.

Ideally a deployment should de-hibernate the environment but apparently that's not the case currently.

 

This hibernation only occurs on the sandbox environment and hence will not be replicable on a non-sandbox environment.

Thanks for the support!

 

Regards,

Rohan Garg

View solution in original post

10 Replies

Avatar

Community Advisor

Hi @Rohan_Garg 

Can you check the error.logs on AEM instances, It could be due to AEM packages are not installed properly. This is just a pure guess.

Meanwhile raise a ticket with Adobe as well.



Arun Patidar

Avatar

Community Advisor

Hey @arunpatidar, it is not due to the packages not getting installed properly.

Adobe is investigating the deployment logs. Thanks for the suggestion!

Avatar

Level 6

Sometimes, if the distribution queue is blocked for a longer time, you might see this issue. Please check if that is the case

Avatar

Community Advisor

Hey @narendiran_ravi, It is not the distribution queue that is blocked - We validated it - The queue is idle and was able to test connection and also able to push content update successfully!

Thanks for the suggestion!

Avatar

Community Advisor

@Rohan_Garg 

 

Often these are environment specific issues. Example: a parent node might not be of expected type. 

 

If the logs are not displaying details, you can try installing mutable content package manually. The Package manager should be able to list the issue.

 

Adobe should have access to more details on deployment failure, so you can even wait for their response.

 

 

 


Aanchal Sikka

Avatar

Community Advisor

Hey @aanchal-sikka, It is yes an environment specific issue.

I will try installing the mutable content manually once but weird that this issue would occur only on Prod pipeline!

Thanks for the suggestion!

Avatar

Community Advisor

Hello @Rohan_Garg,

Please follow the section: Including /var in content package under debugging cloud manager deployment: https://experienceleague.adobe.com/en/docs/experience-manager-learn/cloud-service/debugging/debuggin...

specially the resolution section.

Thanks

 

Avatar

Community Advisor

Hey @A_H_M_Imrul
We do have the filter rule in ui.content's filter.xml

<filter root="/var/workflow/models" mode="merge"/>

But the node structures are indeed defined using repoint so hence shouldn't be a problem.

And again this problem should have occurred on non-prod pipeline too considering it undergoes a daily deployment while the production pipeline goes on a bi-weekly basis.

Thanks for the suggestion!

Avatar

Community Advisor

@Rohan_Garg, it makes sense what you are saying but please make sure you did not miss it: https://helpx.adobe.com/in/experience-manager/kb/cm/cloudmanager-deploy-fails-due-to-sling-distribut.... it talks about a service user: sling-distribution-importer with the mentioned permission through repoinit script. 

it's the 4th point from the resolutions I shared previously.

Please let me know your findings.

Thanks  

Avatar

Correct answer by
Community Advisor

Hi Everyone,

 

Adobe did an RCA on this and provided the below reason-

The program's environment was under hibernation when the deployment was triggered.

Ideally a deployment should de-hibernate the environment but apparently that's not the case currently.

 

This hibernation only occurs on the sandbox environment and hence will not be replicable on a non-sandbox environment.

Thanks for the support!

 

Regards,

Rohan Garg