AEM author standby stuck in initializing state | Community
AndreaB69
Level 3
October 13, 2021
Solved


Hello,

I am trying to install an AEM 6.5.10 author in standby mode according to
https://experienceleague.adobe.com/docs/experience-manager-65/deploying/deploying/tarmk-cold-standby.html

Starting from a working fresh installation, I added the install.standby config files and changed the run mode to standby, but the instance never completely starts.
In the JMX console, the status at org.apache.jackrabbit.oak: Status ("Standby") is always stuck at "initializing".

In the debug log tarmk-coldstandby.log I see no errors, only the message
13.10.2021 12:30:27.636 *INFO* [FelixStartLevel] org.apache.jackrabbit.oak.segment.SegmentNodeStoreService Primary SegmentNodeStore initialized

There is no evidence of it trying to connect to the primary.
There are no errors in error.log either, only standby messages:

13.10.2021 16:29:19.627 *INFO* [OsgiInstallerImpl] org.apache.sling.audit.osgi.installer Installed configuration org.apache.jackrabbit.oak.segment.SegmentNodeStoreService from resource TaskResource(url=fileinstallad09f60708c0fb5aee04cdf46857bff9:/data01/aem65sp10sl/crx-quickstart/install/install.standby/org.apache.jackrabbit.oak.segment.SegmentNodeStoreService.config, entity=config:org.apache.jackrabbit.oak.segment.SegmentNodeStoreService, state=INSTALL, attributes=[org.apache.sling.installer.api.tasks.ResourceTransformer=:52:, service.pid=org.apache.jackrabbit.oak.segment.SegmentNodeStoreService], digest=ad3f6f68af05550167ca525da6d37c53)
13.10.2021 16:29:19.635 *INFO* [OsgiInstallerImpl] org.apache.sling.audit.osgi.installer Installed configuration org.apache.jackrabbit.oak.segment.standby.store.StandbyStoreService from resource TaskResource(url=fileinstallad09f60708c0fb5aee04cdf46857bff9:/data01/aem65sp10sl/crx-quickstart/install/install.standby/org.apache.jackrabbit.oak.segment.standby.store.StandbyStoreService.config, entity=config:org.apache.jackrabbit.oak.segment.standby.store.StandbyStoreService, state=INSTALL, attributes=[org.apache.sling.installer.api.tasks.ResourceTransformer=:52:, service.pid=org.apache.jackrabbit.oak.segment.standby.store.StandbyStoreService], digest=4828214f9eeeaca14032ad5b3c6711fd)
13.10.2021 16:29:41.575 *WARN* [CM Event Dispatcher (Fire ConfigurationEvent: pid=org.apache.jackrabbit.oak.segment.standby.store.StandbyStoreService)] org.apache.sling.commons.scheduler.impl.WhiteboardHandler Ignoring service [java.lang.Runnable] : Property scheduler.period is not of type Long
13.10.2021 16:29:41.577 *INFO* [CM Event Dispatcher (Fire ConfigurationEvent: pid=org.apache.jackrabbit.oak.segment.standby.store.StandbyStoreService)] org.apache.jackrabbit.oak-segment-tar Service [8582, [java.lang.Runnable]] ServiceEvent REGISTERED
13.10.2021 16:50:39.940 *INFO* [FelixStartLevel] org.apache.sling.settings.impl.SlingSettingsServiceImpl Active run modes: [dev, standby, s7connect, crx3, author, samplecontent, crx3tar]
Currently in standby mode.

The only bundle in the Installed state is cq-dam-cfm-graphql.

If I remove the install.standby folder and the standby run mode, the instance starts correctly.

Any hints?

Thanks

 


Peter_Puzanovs
Community Advisor
October 13, 2021

Hi Andrea,

What do you see on the primary author in the JMX console? Does it know about its standby instance?


Regards,

Peter

AndreaB69 (Author)
Level 3
October 13, 2021

Hi,

The primary looks fine.

Testing connectivity (telnet on port 8023) is also OK.
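
For readers without telnet on the standby host, the same reachability test can be sketched in Java. This is only an illustration: the class and method names are hypothetical, "localhost" is a placeholder for the primary's host name, and 8023 is the port mentioned in this thread.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of AEM or Oak.
final class StandbyPortCheck {

    // Returns true if a TCP connection to host:port can be opened within the timeout.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder defaults: pass the primary's host and port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8023;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 3000));
    }
}
```

A successful connection only shows that the port is reachable from the standby. It says nothing about whether the sync job is actually running.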

Peter_Puzanovs
Community Advisor
Accepted solution
October 13, 2021

Hi Andrea,

Just tried with 6.5.10 here, and it works.

It looks like a config got missed somehow. Please update the configs as per the docs and restart your instances again.

Upload a file on the primary and see if anything shows up in the logs.


Regards,

Peter

Sanjay_Bangar
Community Advisor
October 18, 2021

Hi @andreab69,

Can you please try using the latest service pack?

Regards,

Sanjay

AndreaB69 (Author)
Level 3
October 19, 2021

Hi,

Alas, I am already on the latest (service pack 10).

Thanks

Andrea

MaciejMajchrzak
February 3, 2022

I've had the same problem. It looks like standby sync was broken in oak-segment-tar 1.22.9 (available in SP11), or possibly an earlier version, and then fixed in 1.22.10. Because of an earlier change in how the config was handled, an int property could not be cast to long, so the synchronization jobs never ran on the AEM standby instance. As a result, JMX showed the standby process as "Initializing". The problem is visible in the logs too:

13.10.2021 16:29:41.575 *WARN* [CM Event Dispatcher (Fire ConfigurationEvent: pid=org.apache.jackrabbit.oak.segment.standby.store.StandbyStoreService)] org.apache.sling.commons.scheduler.impl.WhiteboardHandler Ignoring service [java.lang.Runnable] : Property scheduler.period is not of type Long
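
To check whether an instance shows this symptom, one could scan error.log for that warning. The following is a rough sketch only: the class and method names are made up for illustration, and the default log path assumes a standard crx-quickstart layout relative to the working directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of AEM or Oak.
final class StandbyLogCheck {

    // Text taken from the warning quoted in this thread.
    static final String MARKER = "Property scheduler.period is not of type Long";
    static final String PID =
            "org.apache.jackrabbit.oak.segment.standby.store.StandbyStoreService";

    // Returns the WARN lines showing the scheduler ignored the standby service's job.
    static List<String> findRejectedSyncJob(List<String> logLines) {
        return logLines.stream()
                .filter(line -> line.contains("*WARN*"))
                .filter(line -> line.contains(PID))
                .filter(line -> line.contains(MARKER))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; pass the real path as the first argument.
        Path log = Paths.get(args.length > 0 ? args[0] : "crx-quickstart/logs/error.log");
        List<String> hits = findRejectedSyncJob(Files.readAllLines(log));
        if (hits.isEmpty()) {
            System.out.println("No scheduler.period warning found for the standby service.");
        } else {
            hits.forEach(System.out::println);
        }
    }
}
```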


Fortunately, this bug has been fixed, along with another bug, in this commit: https://github.com/apache/jackrabbit-oak/commit/1e9cefbb79c1719d9dc1b656a9263b21cd68afaf#diff-45f1fb13aff830b1c631d6063d89925201b9f4f2d47b1b5318d991d2b483f8f1. Installing oak-segment-tar version 1.22.10 resolves the issue.
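
The int versus long mismatch behind the warning can be reproduced in plain Java. The sketch below is not the Oak or Sling code; the class and method names are invented, and the value 5 is an arbitrary example. It only illustrates that a boxed Integer is never an instance of Long, so a consumer that insists on Long will reject it, while one that accepts any Number will not.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only; hypothetical names, not taken from Oak or Sling.
final class SchedulerPeriodDemo {

    // Strict reading: accepts the property only if it is already a Long.
    static Long strictPeriod(Map<String, Object> props) {
        Object value = props.get("scheduler.period");
        return (value instanceof Long) ? (Long) value : null;
    }

    // Lenient reading: accepts any numeric type and widens it to long.
    static Long lenientPeriod(Map<String, Object> props) {
        Object value = props.get("scheduler.period");
        return (value instanceof Number) ? Long.valueOf(((Number) value).longValue()) : null;
    }

    public static void main(String[] args) {
        Map<String, Object> props = new HashMap<>();
        props.put("scheduler.period", Integer.valueOf(5)); // stored as an int, as in the bug

        System.out.println("strict:  " + strictPeriod(props));  // prints "strict:  null"
        System.out.println("lenient: " + lenientPeriod(props)); // prints "lenient: 5"

        // A direct cast of the boxed Integer fails at runtime.
        Object boxed = props.get("scheduler.period");
        try {
            Long period = (Long) boxed;
            System.out.println("cast succeeded: " + period);
        } catch (ClassCastException e) {
            System.out.println("cast failed: Integer cannot be cast to Long");
        }
    }
}
```

With the strict reading the job has no usable period, which is consistent with the sync job never being scheduled on the standby.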