SOLVED

AEM author standby stuck in initializing state

Level 3

Hello,

I am trying to install an AEM 6.5.10 author in standby mode according to
https://experienceleague.adobe.com/docs/experience-manager-65/deploying/deploying/tarmk-cold-standby...

Starting from a working fresh installation, I added the install.standby config files and changed the run mode to standby, but the instance never finishes starting.
In the JMX console, under org.apache.jackrabbit.oak: Status ("Standby"), the status is

always stuck at "initializing".

In the debug log tarmk-coldstandby.log I see no errors,
only the message
13.10.2021 12:30:27.636 *INFO* [FelixStartLevel] org.apache.jackrabbit.oak.segment.SegmentNodeStoreService Primary SegmentNodeStore initialized

There is no evidence of it trying to connect to the primary.
There are no errors in error.log either,
only standby messages:

13.10.2021 16:29:19.627 *INFO* [OsgiInstallerImpl] org.apache.sling.audit.osgi.installer Installed configuration org.apache.jackrabbit.oak.segment.SegmentNodeStoreService from resource TaskResource(url=fileinstallad09f60708c0fb5aee04cdf46857bff9:/data01/aem65sp10sl/crx-quickstart/install/install.standby/org.apache.jackrabbit.oak.segment.SegmentNodeStoreService.config, entity=config:org.apache.jackrabbit.oak.segment.SegmentNodeStoreService, state=INSTALL, attributes=[org.apache.sling.installer.api.tasks.ResourceTransformer=:52:, service.pid=org.apache.jackrabbit.oak.segment.SegmentNodeStoreService], digest=ad3f6f68af05550167ca525da6d37c53)
13.10.2021 16:29:19.635 *INFO* [OsgiInstallerImpl] org.apache.sling.audit.osgi.installer Installed configuration org.apache.jackrabbit.oak.segment.standby.store.StandbyStoreService from resource TaskResource(url=fileinstallad09f60708c0fb5aee04cdf46857bff9:/data01/aem65sp10sl/crx-quickstart/install/install.standby/org.apache.jackrabbit.oak.segment.standby.store.StandbyStoreService.config, entity=config:org.apache.jackrabbit.oak.segment.standby.store.StandbyStoreService, state=INSTALL, attributes=[org.apache.sling.installer.api.tasks.ResourceTransformer=:52:, service.pid=org.apache.jackrabbit.oak.segment.standby.store.StandbyStoreService], digest=4828214f9eeeaca14032ad5b3c6711fd)
13.10.2021 16:29:41.575 *WARN* [CM Event Dispatcher (Fire ConfigurationEvent: pid=org.apache.jackrabbit.oak.segment.standby.store.StandbyStoreService)] org.apache.sling.commons.scheduler.impl.WhiteboardHandler Ignoring service [java.lang.Runnable] : Property scheduler.period is not of type Long
13.10.2021 16:29:41.577 *INFO* [CM Event Dispatcher (Fire ConfigurationEvent: pid=org.apache.jackrabbit.oak.segment.standby.store.StandbyStoreService)] org.apache.jackrabbit.oak-segment-tar Service [8582, [java.lang.Runnable]] ServiceEvent REGISTERED
13.10.2021 16:50:39.940 *INFO* [FelixStartLevel] org.apache.sling.settings.impl.SlingSettingsServiceImpl Active run modes: [dev, standby, s7connect, crx3, author, samplecontent, crx3tar]
Currently in standby mode.

The only bundle in the Installed state is cq-dam-cfm-graphql.

If I remove the install.standby folder and the standby run mode, the instance starts correctly.

Any hint?

thanks

 

1 Accepted Solution

Correct answer by
Community Advisor

Hi Andrea,


I just tried with 6.5.10 here and it works.

It looks like a config got missed somehow. Please update the configs as per the docs and restart your instances.

Then upload a file on the primary and see if anything shows up in the logs.


Regards,

Peter

6 Replies

Community Advisor

Hi Andrea,


What do you see on the primary author in the JMX console? Does it know about its standby instance?


Regards,

Peter

Level 3

Hi,

the primary looks fine

AndreaB69_0-1634140721866.png

Testing connectivity (telnet on port 8023) is also OK.
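
For anyone without telnet on the host, the same reachability check can be sketched in a few lines of Python. This is only a rough equivalent under assumptions: the function name `can_connect` and the host name in the usage note are hypothetical, and the port 8023 is the one mentioned above. A successful TCP connect only shows that the port is reachable, not that standby sync is working.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    This mirrors a plain telnet reachability test. It does not speak
    the standby protocol, so True only means the port accepts connections.
    """
    try:
        # create_connection resolves the host and applies the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
```

Run from the standby machine, a call such as `can_connect("primary.example.com", 8023)` (hypothetical host name) should return True if the primary's standby port is reachable.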

Community Advisor

Hi @AndreaB69,

Can you please try the latest service pack?

Regards,

Sanjay

Level 3

Hi,

Alas, I am already on the latest (Service Pack 10).

thanks

Andrea

Level 1

I've had the same problem. It looks like standby sync was broken in oak-segment-tar 1.22.9 (available in SP11), or possibly in an earlier version, and then fixed in 1.22.10. Because of an earlier change in how the config was handled, an int property could not be cast to long, so the synchronization jobs were never run on the AEM standby instance. As a result, JMX showed the standby process as "Initializing". The problem is visible in the logs too:

13.10.2021 16:29:41.575 *WARN* [CM Event Dispatcher (Fire ConfigurationEvent: pid=org.apache.jackrabbit.oak.segment.standby.store.StandbyStoreService)] org.apache.sling.commons.scheduler.impl.WhiteboardHandler Ignoring service [java.lang.Runnable] : Property scheduler.period is not of type Long


Fortunately, this bug has been fixed, along with another bug, in this commit: https://github.com/apache/jackrabbit-oak/commit/1e9cefbb79c1719d9dc1b656a9263b21cd68afaf#diff-45f1fb... and installing oak-segment-tar version 1.22.10 resolves the issue.
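
To check quickly whether an instance is hitting this problem, one option is to scan the log for the warning quoted above. The following is a minimal sketch, not an official tool: the function names are made up for illustration, and the regular expression is built only from the single log line shown in this thread, so other versions may word the message differently.

```python
import re

# Matches the Sling scheduler warning quoted in this thread.
# Assumption: the message text is stable across versions.
SCHEDULER_WARNING = re.compile(
    r"\*WARN\*.*WhiteboardHandler Ignoring service \[java\.lang\.Runnable\] : "
    r"Property scheduler\.period is not of type Long"
)


def find_scheduler_period_warnings(lines):
    """Return the log lines that contain the scheduler.period type warning."""
    return [line.rstrip("\n") for line in lines if SCHEDULER_WARNING.search(line)]


def standby_sync_likely_skipped(lines):
    """Return True if a matching warning names the StandbyStoreService PID.

    That combination is what this thread associates with a standby
    that stays in the "initializing" state.
    """
    hits = find_scheduler_period_warnings(lines)
    return any("StandbyStoreService" in hit for hit in hits)
```

A typical use would be to open the instance's error.log (the exact path depends on the installation) and pass the file object to `standby_sync_likely_skipped`. A True result suggests checking the installed oak-segment-tar version.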