SOLVED

Replication feature is breaking our instances (Package Manager)

Level 3

Hi guys,

We have a problem: the moment we click to replicate our `apps` package in Package Manager, all of our instances (author and publish) go down. They start returning HTTP 503 for everything, and we can't even access CRXDE or Package Manager.

I don't know whether that status code is real, because it is returned by Amazon ELB, and the ELB may be replacing the original status code (our instances sit behind an ELB).

We have replicated the same way on local machines and in other official Adobe environments, and the problem occurs in only one specific environment.

Has anyone had this problem before?

Thanks,

Fernando Uchiyama

1 Accepted Solution

Correct answer by
Level 3

Hi guys, we opened a ticket with Daycare, and the problem is solved now.

Let me explain it here, so it can help others.

The real problem was caused by an ACS Commons feature called Versioned Clientlibs, which we have started using and which is ready to be deployed.

ACS Commons was installed correctly on our `author` instance, so we didn't have problems there. When we replicated our apps package to the `publish` instance, the instance went down because ACS Commons was missing.

What we still don't understand is why `author` went down when the problem was on `publish`. It may be related to Amazon ELB.

We replicated ACS Commons by accessing the author instance directly via its IP address.

Now everything is working fine.
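
For anyone who wants to catch this before replicating, here is a minimal sketch of a pre-flight check that compares the packages an `apps` package depends on against what each instance has installed. It is only an illustration: the function name, the package names, and the inventory dictionary are hypothetical, and how you obtain the list of installed packages from each instance is left out.

```python
from typing import Dict, Iterable, List, Set


def missing_dependencies(
    required: Iterable[str],
    installed_by_instance: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """Return, per instance, the required packages that are not installed."""
    required = list(required)
    report: Dict[str, List[str]] = {}
    for instance, installed in installed_by_instance.items():
        missing = [name for name in required if name not in installed]
        if missing:
            report[instance] = missing
    return report


# Hypothetical inventory: ACS Commons is present on author but not on publish.
inventory = {
    "author": {"acs-aem-commons-content", "my-apps"},
    "publish": {"my-apps"},
}
report = missing_dependencies(["acs-aem-commons-content"], inventory)
print(report)  # {'publish': ['acs-aem-commons-content']}
```

An empty report would mean every instance has what the package needs; anything else is a reason to replicate the dependency first.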


6 Replies

Employee

How big is the package with "apps"?

Do other replications work, i.e., can you replicate a page or a DAM asset?

Level 3

This package is very small, about 2-3 MB.

Our `content` package is bigger, around 40 MB now (but the problem happens when replicating the `apps` package).

Level 10

I am checking internally to see if this is a known issue. If you do not hear back, I recommend opening a support ticket. There is a bug somewhere.

Level 3

It is something related to the dispatcher.

If we access the instance directly by its IP address, it works.
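
A rough sketch of how that comparison could be scripted: request the same path once directly from the instance and once through the load balancer, then compare the status codes. The function names and the wording of the diagnoses are my own, and the URLs you pass in are whatever applies to your setup.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for url, including 4xx/5xx responses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises for error statuses; the code is still available.
        return err.code


def diagnose(direct_status: int, balanced_status: int) -> str:
    """Compare the status seen directly with the one seen via the load balancer."""
    if direct_status == balanced_status:
        return "same status on both paths"
    if direct_status < 500 <= balanced_status:
        return "instance answers directly; the error comes from a layer in front of it"
    return "statuses differ; inspect both layers"


print(diagnose(200, 503))
```

Connection failures (for example, a refused connection) are not handled here and would surface as `urllib.error.URLError`.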

Employee Advisor

Well, I would say this is the kind of behaviour I would expect.

Let me elaborate. Your apps package typically contains a lot of stuff: templates, components, bundles. When you install it on a system, you can see in the logs that a lot of activity starts. Services restart, dependencies are re-wired, caches are rebuilt, and so on; there may be moments when rendering simply does not work. You are deploying your application, so this is more or less expected behaviour. But this is not the core of your problem.

The problem is that you do this on all publish instances at the same time. When you replicate that package to all publish instances, it arrives on each of them at nearly the same time, so you bring all of them down together. During that time the ELB cannot reach a working publish instance and returns a 503.

You need to establish a blue-green style of deployment for your publish instances. Split your publish instances into 2 distinct sets and deploy to one set at a time (leaving the other one unaffected). That isn't as easy as hitting "replicate" and requires more work or automation, but it is definitely worth investing time in.
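
As a minimal sketch of that idea (not a complete deployment tool): the instances are split into sets, the package goes to one set at a time, and the rollout stops before touching the next set if anything in the current set is unhealthy. The `deploy` and `is_healthy` callbacks and the host names are placeholders; what they actually do depends on your environment.

```python
from typing import Callable, List, Sequence


def split_into_sets(instances: Sequence[str], n_sets: int = 2) -> List[List[str]]:
    """Distribute instances round-robin into n_sets groups."""
    return [list(instances[i::n_sets]) for i in range(n_sets)]


def deploy_in_stages(
    instances: Sequence[str],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    n_sets: int = 2,
) -> List[str]:
    """Deploy to one set at a time; stop before the next set if a host is unhealthy."""
    completed: List[str] = []
    for group in split_into_sets(instances, n_sets):
        for host in group:
            deploy(host)
        unhealthy = [host for host in group if not is_healthy(host)]
        if unhealthy:
            raise RuntimeError(f"stopping rollout, unhealthy after deploy: {unhealthy}")
        completed.extend(group)
    return completed


# Hypothetical hosts and stand-in callbacks, for illustration only.
deployed: List[str] = []
order = deploy_in_stages(
    ["publish1", "publish2", "publish3", "publish4"],
    deploy=deployed.append,
    is_healthy=lambda host: True,
)
print(order)  # ['publish1', 'publish3', 'publish2', 'publish4']
```

In a real setup you would probably also take the set being deployed out of the load balancer first, so that the other set keeps serving traffic.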

Jörg
