How to detect when service is down?

Level 2

During the LCCS service outage this morning, my app seemed to just hang on the service login when it was launched (I saw the same behavior when I logged into my developer portal account).

I am currently checking for exception generation around the session login call and have a callback registered for the error event. Is there something else that I should have in place which would have been able to catch this so that I can display something meaningful to my users?

Thanks

9 Replies

Former Community Member

Hi,

Catching an exception on login is actually a good option. When you catch the exception, you can show a login error message that mentions the server. Since most login errors are caused by a wrong username, password, or room URL from the user, you need to word the error message generically enough to cover both cases.

Thanks

Hironmay Basu

Level 2

My code currently catches any exception and displays a rather generic message, but none seems to be generated in this case...

-Chris

BTW: absolutely love this stuff. We are getting ready to launch our first app today!

Employee

Sorry, this was a particularly strange case where the system was just "stuck".

We will fix this so that errors are returned when this particular situation occurs (and of course we'll also work on fixing the root cause so that this doesn't happen again).
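
Until the service reports this state as an error, one client-side option is to put a timeout around the login call, since a hang raises no exception on its own. Here is a minimal sketch in Ruby (the language of the server-side sample discussed later in this thread). The `login_with_timeout` helper and the block passed to it are hypothetical stand-ins, not part of the LCCS SDK:

```ruby
require 'timeout'

# Hypothetical helper (not part of the LCCS SDK): runs the caller's login
# block, but gives up after `seconds` so a "stuck" service surfaces as a
# result the app can report instead of an indefinite hang.
def login_with_timeout(seconds)
  Timeout.timeout(seconds) { yield }
  :ok
rescue Timeout::Error
  # The call never returned: most likely the service is unreachable or hung.
  :timed_out
rescue StandardError
  # The call returned an error: bad credentials, bad room URL, server error.
  :failed
end
```

The caller can then show a "service may be unavailable" message for `:timed_out` and keep the generic login error for `:failed`.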

Level 2

Cool - you guys are great! Keep up the good work

-Chris

Level 2

OK - another request related to the status of the LCCS service. I have been using health.acrobat.com and watching the corresponding RSS feed in Google Reader to keep track of scheduled maintenance events and service outages.

I am trying to track down an older log entry where a call to keepalive on the Ruby AccountManager object generated a 403, which was logged on our side yesterday, Thursday 2/18, around 4:40pm PST. I am trying to figure out what could have caused the problem, but I didn't see anything in the RSS feed indicating that there were any recorded problems yesterday.

Is there another resource that I can check to help troubleshoot this sort of error when we are unable to connect to the service?

Thanks - Chris

Employee

I don't think this error resulted from the service being down, since I do return 403 on some requests (and we don't have any other reports of anomalies for yesterday).

My guess is that your keepalive was "late" and the authentication token had already expired (I'll try to reproduce this case and possibly update the keepalive method to catch the error correctly).

If you can send me your account URL and the approximate time of failure I can check our logs and verify this hypothesis.

Level 2

Thanks - from looking at the keepalive code documentation, we thought that if we passed our credentials along, it would re-authenticate us even if we were "late".

You are correct, though, that do_initialize raises some errors that the keepalive method does not handle, and these are what we are logging at a higher level. I am going to change our copy of the AFCS Ruby sample to handle these errors within the keepalive method and retry the authentication instead of passing the error through.

-Chris

Employee

Cool, let me know how it works and I'll update the "official" scripts too.

Level 2

I updated AccountManager.keepalive to capture any errors raised by the underlying call to do_initialize and, if a username is supplied, return the result of trying to log in again. Otherwise I just return false.
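
For anyone following along, the control flow is roughly as below. This is only a sketch: `SketchAccountManager` is a stand-in class, not the AccountManager from the AFCS Ruby sample, and the bodies of `do_initialize` and `login` are placeholders injected as lambdas so the example runs without the service. The success-path return value (whatever `do_initialize` returns) is also an assumption.

```ruby
# Stand-in for the sample's AccountManager. Only the shape of keepalive
# matters here; do_initialize and login are placeholders (assumptions),
# injected so the sketch can run without contacting the service.
class SketchAccountManager
  def initialize(initializer:, authenticator:)
    @initializer = initializer
    @authenticator = authenticator
  end

  def do_initialize
    @initializer.call
  end

  def login(username, password)
    @authenticator.call(username, password)
  end

  # If do_initialize raises (for example on a 403 from an expired token),
  # retry the login when a username was supplied; otherwise return false
  # instead of passing the error up to the caller.
  def keepalive(username = nil, password = nil)
    do_initialize
  rescue StandardError
    username ? login(username, password) : false
  end
end
```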

That should catch the case I was seeing, but time will tell... We'll have to see if it happens again.

Thanks