Ideas to resolve the “HTTP ERROR 503 AuthenticationSupport service missing. Cannot authenticate request” error, by Sudheer Sundalam
Problem Statement: Our Integration AEM Author server, which is hosted on Linux CentOS, was displaying the “HTTP ERROR 503 AuthenticationSupport service missing. Cannot authenticate request” error when accessing any page, including the login page, CRXDE, and the Package Manager URLs.
Background: We had run the bulk workflow feature from ACS Commons to migrate all our assets to Dynamic Media Scene7.
The bulk workflow completed successfully for most of the assets, but the size of the repository also doubled, and the server ended up low on disk space.
We suspect it also caused the Java out-of-memory errors in the logs (but we couldn’t validate this, as we had deleted the logs to free up some space on the disk).
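A quick way to confirm this kind of disk pressure is to compare the free space on the volume with the size of the repository and log folders. The following is a minimal sketch; the helper name report_disk_usage is hypothetical, and the example path /mnt/aem/author is simply the install location that appears in the log excerpt later in this post.

```shell
# Hypothetical helper: report free space and folder sizes for an AEM install.
# $1 is the directory that contains crx-quickstart (an assumption; adjust it).
report_disk_usage() {
    aem_home="$1"
    # Free space on the filesystem that holds the instance
    df -h "$aem_home"
    # Size of the repository and of the logs folder (errors for missing folders are hidden)
    du -sh "$aem_home/crx-quickstart/repository" "$aem_home/crx-quickstart/logs" 2>/dev/null
}

# Usage:
#   report_disk_usage /mnt/aem/author
```

Running this before and after a bulk migration shows how much of the volume the repository is consuming.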
Resolution Steps performed:
We asked our IT team to add more disk space to this server, and they increased it by 100GB. I restarted the AEM Author server to see if that fixed anything. It fixed nothing. The server started a few of the services and then showed the following errors in the error log.
“*ERROR* [qtp194923719-62] org.apache.sling.engine.impl.SlingHttpContext handleSecurity: AuthenticationSupport service missing. Cannot authenticate request.”
“*ERROR* [qtp194923719-62] org.apache.sling.engine.impl.SlingHttpContext handleSecurity: Possible reason is missing Repository service. Check AuthenticationSupport dependencies.”
In order to fix the AEM Author server, I started to follow the resolution steps from various community forum posts.
- Stopped the AEM server, deleted the “repoinit” folder present at crx-quickstart/launchpad/config/org/apache/sling/jcr/repoinit, and started the AEM Author server. This didn’t change anything. I saw the same error in the logs.
- Stopped the AEM server, deleted the crx-quickstart/repository/segmentstore/repo.lock and crx-quickstart/launchpad/felix/cache.lock files, and started the server. I saw the same error in the error log.
- Stopped the AEM server, deleted the ‘index’ folder present at crx-quickstart/repository/index, and started the server. Nothing changed, and indexing didn’t even start.
As none of the above steps got the server started, I decided to check the segmentstore and revert to the last known good segmentstore revision by following the well-documented steps on this AEM Community blog page:
https://helpx.adobe.com/experience-manager/kb/SegmentNotFoundException.html
By following the SegmentNotFoundException blog page, I was able to restore the last known good revision of the segment store. These were the steps:
- Deleted all the segment entries after the last known good revision entry from the journal.log.
- Removed all the /crx-quickstart/repository/segmentstore/*.bak files
- Deleted ‘repo.lock’ file
- Removed un-referenced checkpoints by running:
- java -jar oak-run-*.jar checkpoints ./crx-quickstart/repository/segmentstore rm-unreferenced
- Performed the offline compaction using:
- java -jar oak-run-*.jar compact ./crx-quickstart/repository/segmentstore/
- Restarted the server.
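The journal.log edit in the first step above can be scripted rather than done by hand. The following is only a sketch: the helper name trim_journal is hypothetical, and it assumes that each journal.log line begins with the revision id as its first whitespace-separated field and that newer entries are appended at the end of the file. It writes the trimmed content back into the existing file, so the file's owner and permissions are left untouched.

```shell
# Hypothetical helper: keep journal entries up to and including the last
# known good revision. Assumes one entry per line, revision id first,
# newest entries last. Run it only while AEM is stopped.
trim_journal() {
    journal="$1"     # path to journal.log
    good_rev="$2"    # last known good revision id
    # Copy lines until the first one whose first field is the good revision;
    # awk exits with status 2 if that revision never appears.
    awk -v rev="$good_rev" '
        { print }
        $1 == rev { found = 1; exit }
        END { if (!found) exit 2 }
    ' "$journal" > "$journal.tmp"
    if [ "$?" -ne 0 ]; then
        rm -f "$journal.tmp"
        echo "revision $good_rev not found in $journal" >&2
        return 1
    fi
    # Keep one untouched copy of the original; never overwrite an existing backup.
    [ -e "$journal.orig" ] || cp -p "$journal" "$journal.orig" || return 1
    # Rewrite in place so ownership and permissions of journal.log are preserved.
    cat "$journal.tmp" > "$journal" && rm -f "$journal.tmp"
}

# Usage (the revision id is whatever the consistency check reported as good):
#   trim_journal ./crx-quickstart/repository/segmentstore/journal.log "<good-revision-id>"
```

If the revision is not found, the function leaves journal.log unchanged and returns a non-zero status.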
Even after rolling the segment store back to the last known good revision, cleaning up the checkpoints, and running offline compaction, I saw the exact same error in the logs during server startup:
“[qtp194923719-62] org.apache.sling.engine.impl.SlingHttpContext handleSecurity: AuthenticationSupport service missing. Cannot authenticate request.”
“*ERROR* [qtp194923719-62] org.apache.sling.engine.impl.SlingHttpContext handleSecurity: Possible reason is missing Repository service. Check AuthenticationSupport dependencies.”
Exhausted by trying all the options and seeing the exact same error in the logs, I shut the laptop for the day and left.
Troubleshooting continued….
The next day, after restarting the server, instead of accessing the AEM Author login page, CRXDE, or CRX Package Manager, I tried accessing the OSGi “/system/console” page. To my surprise, the browser prompted for a username and password. I tried entering “admin/configuredPassword”, but it was not accepted. I tried entering the password in different ways (copy and paste, then typing it letter by letter). It still didn’t let me log in.
After some time, just to check, I entered “admin/admin” as the username and password. Voila! It let me log in, and I could see all the bundles in the active state, including my project bundles. I compared the active bundle count with that of the publish server and found 3-4 fewer bundles on the author than on the publish.
At this point, having gained access to the OSGi console, I tried to run the repository consistency check using the JMX console. So, I went to the JMX console at “/system/console/jmx” and searched for the “repository” service. Again, to my surprise, I didn’t find the “repository” service in the JMX console. Accessing “/system/console/jmx/com.adobe.granite%3Atype%3DRepository” returned a 404 page.
This confirmed my suspicion that the repository was not being initialized at all during server startup.
As this AEM Author server is hosted on Linux CentOS, I had configured it as a system service and always used the service init script to start and stop the server.
Just to try things differently, this time, instead of using the init script, I started the server as the root user with the following plain old java command.
- java -Xms256m -Xmx4096m -XX:MaxPermSize=256m -jar aem-author-p4502.jar -Dsling.run.modes=author,crx3,nosamplecontent,dynamicmedia_scene7
The server started normally without any concerning errors. I was able to log in via the login page, verify the project content, etc.
Also, this time, while accessing the OSGi console, I had to enter “admin/configuredPassword” to log in instead of “admin/admin”.
I checked the repository service via the JMX console, and I could see it as well at “/system/console/jmx/com.adobe.granite%3Atype%3DRepository”.
This assured me that the repository was not corrupt and was in fact in very good condition, with all my project content intact.
I checked the logs and saw indexing in progress (as I had deleted the index folder in an earlier step). While checking the logs, I saw a few “Permission Denied” errors that I had never seen before. This one especially got me thinking.
“*ERROR* [FelixStartLevel] org.apache.jackrabbit.oak-segment-tar bundle org.apache.jackrabbit.oak-segment-tar:1.22.13 (103)[org.apache.jackrabbit.oak.segment.SegmentNodeStoreService(291)] : The activate method has thrown an exception (java.io.FileNotFoundException: /mnt/aem/author/crx-quickstart/repository/segmentstore/journal.log (Permission denied))
java.io.FileNotFoundException: /mnt/aem/author/crx-quickstart/repository/segmentstore/journal.log (Permission denied)”
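Errors like this are easy to miss in a long error.log, so it can help to pull out every path that was refused. The following is a minimal sketch; the helper name denied_paths is hypothetical, it relies on the “(Permission denied)” wording shown in the excerpt above, and it does not handle paths that contain spaces.

```shell
# Hypothetical helper: list the distinct file paths that were refused
# with "(Permission denied)" in a log file.
denied_paths() {
    log_file="$1"
    # Extract "<path> (Permission denied)", drop the suffix, and de-duplicate
    grep -o '/[^ ]* (Permission denied)' "$log_file" \
        | sed 's/ (Permission denied)$//' \
        | sort -u
}

# Usage:
#   denied_paths crx-quickstart/logs/error.log
```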
Permission denied for the journal.log file?? I quickly went into the crx-quickstart folder and checked the owner/group on the Linux server.
I saw that the ownership of most of the folders and files had changed to “root/root”.
I went further into repository folder and checked the permissions. I saw “root/root” as user and group for all the folders under the “repository” folder.
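The same check can be done in one pass instead of browsing folder by folder. The following is a minimal sketch; the helper name not_owned_by is hypothetical, and ‘aem’ is the service user described in this post.

```shell
# Hypothetical helper: print every file and folder under a directory
# that is NOT owned by the given user (name or numeric id).
not_owned_by() {
    dir="$1"
    user="$2"
    find "$dir" ! -user "$user" -print
}

# Usage:
#   not_owned_by /mnt/aem/author/crx-quickstart aem | head -n 20
```

An empty result means everything under the folder already belongs to the service user.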
We are not sure how or at what point the folder permissions were set to “root/root”, but I determined that this is the “ROOT” cause based on everything put together.
During the initial server setup, to run AEM as a Linux service, I had created ‘aem’ as the service user and always used the init script to start and stop the server.
So, once the folder ownership changed to “root/root”, the ‘aem’ system user was unable to access the ‘repository’ folder when the AEM server was started using the init script.
I used the Linux change-owner command (chown) on all the folders and files present under the crx-quickstart folder. This set all of them to the ‘aem/root’ user/group combination.
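A recursive chown along the following lines produces that ownership. This is a sketch and not necessarily the exact command that was run; the helper name fix_ownership is hypothetical, and the example path is the install location that appears in the log excerpt above.

```shell
# Hypothetical helper: recursively hand a directory tree to a user and group.
# Changing ownership to another account normally requires root.
fix_ownership() {
    dir="$1"
    owner="$2"
    group="$3"
    chown -R "$owner:$group" "$dir"
}

# Usage (as root, with AEM stopped), using the user and group named in this post:
#   fix_ownership /mnt/aem/author/crx-quickstart aem root
```

Starting AEM directly as root, as in the troubleshooting above, can leave newly created repository files owned by root, so it is worth re-checking ownership after any such run.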
After setting the correct folder ownership, I used the init script to start the AEM Author server, and there it was!! The server started normally, and I had successfully resolved the deadly “HTTP ERROR 503 AuthenticationSupport service missing. Cannot authenticate request” error.
I know this is a long post, but I wanted to share my experience, as many of us get stuck on this error without much success other than restoring a backup. Thanks for reading all the way!
Thanks,
Sudheer Sundalam.