pbcast warning in the server logs

Level 1

Hi,

Since this morning, the server log has been filling up with the following warnings:

2013-10-13 08:32:12,775 WARN  [org.jgroups.protocols.pbcast.NAKACK] 10.13.32.43:65517] discarded message from non-member 10.13.32.43:56954, my view is [10.13.32.43:65517|2] [10.13.32.43:65517]

2013-10-13 08:32:33,333 WARN  [org.jgroups.protocols.pbcast.NAKACK] 10.13.32.43:65517] discarded message from non-member 10.13.32.43:56954, my view is [10.13.32.43:65517|2] [10.13.32.43:65517]

2013-10-13 08:32:33,333 WARN  [org.jgroups.protocols.pbcast.NAKACK] 10.13.32.43:65517] discarded message from non-member 10.13.32.43:56954, my view is [10.13.32.43:65517|2] [10.13.32.43:65517]

2013-10-13 08:32:12,715 WARN  [org.jgroups.protocols.pbcast.NAKACK] 10.13.32.43:56954] discarded message from non-member 10.13.32.43:65517, my view is [10.13.32.43:56954|0] [10.13.32.43:56954]

2013-10-13 08:32:40,745 WARN  [org.jgroups.protocols.pbcast.NAKACK] 10.13.32.43:56954] discarded message from non-member 10.13.32.43:65517, my view is [10.13.32.43:56954|0] [10.13.32.43:56954]

2013-10-13 08:32:40,922 WARN  [org.jgroups.protocols.pbcast.NAKACK] 10.13.32.43:56954] discarded message from non-member 10.13.32.43:65517, my view is [10.13.32.43:56954|0] [10.13.32.43:56954]

Tracing it back, the issue first appeared after a server restart, which began with the following warnings and errors:

2013-10-11 23:44:50,263 WARN  [org.jgroups.protocols.FD] I was suspected by 10.13.32.43:65517; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK

2013-10-11 23:44:55,095 WARN  [org.jgroups.protocols.FD] ping_dest is null: members=[10.13.32.43:65514, 10.13.32.43:65517], pingable_mbrs=[10.13.32.43:65517], local_addr=10.13.32.43:65517

2013-10-11 23:44:56,677 WARN  [org.jgroups.protocols.FD] I was suspected by 10.13.32.43:65517; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK

2013-10-11 23:45:02,133 WARN  [org.jgroups.protocols.FD] I was suspected by 10.13.32.43:65517; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK

2013-10-11 23:45:09,471 WARN  [org.jgroups.protocols.pbcast.GMS] failed to collect all ACKs (1) for view [10.13.32.43:65517|2] [10.13.32.43:65517] after 2000ms, missing ACKs from [10.13.32.43:65517] (received=[10.13.32.43:65514]), local_addr=10.13.32.43:65517

2013-10-11 23:45:10,958 WARN  [org.jgroups.protocols.pbcast.GMS] I (10.13.32.43:65514) am not a member of view [10.13.32.43:65517|2] [10.13.32.43:65517], shunning myself and leaving the group (prev_members are [10.13.32.43:65514 10.13.32.43:65517 ], current view is [10.13.32.43:65514|1] [10.13.32.43:65514, 10.13.32.43:65517])

2013-10-11 23:45:20,632 WARN  [org.jgroups.protocols.pbcast.NAKACK] 10.13.32.43:65517] discarded message from non-member 10.13.32.43:65514, my view is [10.13.32.43:65517|2] [10.13.32.43:65517]

2013-10-11 23:45:20,989 WARN  [org.jgroups.protocols.FD] I was suspected by 10.13.32.43:65517; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK

2013-10-11 23:45:21,192 WARN  [org.jgroups.protocols.FD] I was suspected by 10.13.32.43:65517; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK

2013-10-11 23:45:21,395 WARN  [org.jgroups.protocols.pbcast.NAKACK] 10.13.32.43:65514] discarded message from non-member 10.13.32.43:65517, my view is [10.13.32.43:65514|1] [10.13.32.43:65514, 10.13.32.43:65517]

2013-10-11 23:45:21,192 ERROR [org.jgroups.protocols.pbcast.NAKACK] sender 10.13.32.43:65514 not found in received_msgs

2013-10-11 23:45:27,698 ERROR [org.jgroups.protocols.pbcast.NAKACK] range is null

2013-10-11 23:45:27,698 ERROR [org.jgroups.protocols.pbcast.NAKACK] sender 10.13.32.43:65517 not found in received_msgs

2013-10-11 23:45:27,888 ERROR [org.jgroups.protocols.pbcast.NAKACK] range is null

2013-10-11 23:45:27,888 WARN  [org.jgroups.protocols.VIEW_ENFORCER] exception: QueueClosedException

2013-10-11 23:45:52,667 ERROR [org.jgroups.protocols.pbcast.STABLE] down_handler thread for STABLE was interrupted (in order to be terminated), but is is still alive

2013-10-11 23:46:16,041 ERROR [com.adobe.livecycle.events.RemoteEventThread]

Surprisingly, we never configured a cluster; this is a standalone, single-server environment.

Has anyone faced this issue before?

Any pointers on how to get rid of it?

Thanks.

2 Replies

Level 3

It seems that two ports (65514 and 65517) on the same machine (10.13.32.43) are involved here.

So if it's a standalone server, check which processes are using these ports; that may give you some pointers.
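
In case it helps, here is a minimal Python sketch for pulling the addresses out of the NAKACK "discarded message from non-member" warnings, so you know which ports to look up. The regular expression is an assumption based only on the log lines pasted above, and the function names are just illustrative; adjust both if your log format differs.

```python
import re
from collections import defaultdict

# Assumed format, taken from the NAKACK warnings quoted in this thread:
# "... [org.jgroups.protocols.pbcast.NAKACK] <local>] discarded message
#  from non-member <sender>, my view is ..."
PATTERN = re.compile(
    r"\[org\.jgroups\.protocols\.pbcast\.NAKACK\]\s+"
    r"\[?(?P<local>\d+\.\d+\.\d+\.\d+:\d+)\]\s+"
    r"discarded message from non-member (?P<sender>\d+\.\d+\.\d+\.\d+:\d+)"
)


def discarded_pairs(lines):
    """Count (local member, rejected sender) address pairs in the log lines."""
    counts = defaultdict(int)
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(match.group("local"), match.group("sender"))] += 1
    return dict(counts)


def ports_to_check(lines):
    """Return the sorted, distinct port numbers seen on either side."""
    ports = set()
    for local, sender in discarded_pairs(lines):
        ports.add(int(local.rsplit(":", 1)[1]))
        ports.add(int(sender.rsplit(":", 1)[1]))
    return sorted(ports)
```

Feed it the lines of your server log (for example `ports_to_check(open(path))`, where `path` is wherever your log lives), then look up the resulting ports with your operating system's tools, such as `netstat`, to see which processes own them.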

~ Varun

Level 1

Thanks Varun.

However, my concern is this: if I configured it as a standalone, single-server environment, why are these ports being added as cluster members?