13.8. JGroups Troubleshooting

13.8.1. Nodes do not form a cluster
复制链接

Make sure your machine is set up correctly for IP multicast. There are 2 test programs that can be used to detect this: McastReceiverTest and McastSenderTest. For example:

java -cp EAP_HOME/bin/client/jboss-client.jar org.jgroups.tests.McastReceiverTest -mcast_addr 230.11.11.11 -port 5555 -bind_addr YOUR_BIND_ADDRESS

java -cp EAP_HOME/bin/client/jboss-client.jar org.jgroups.tests.McastReceiverTest -mcast_addr 230.11.11.11 -port 5555 -bind_addr YOUR_BIND_ADDRESSEAP_HOME/bin/client/jboss-client.jar org.jgroups.tests.McastReceiverTest -mcast_addr 230.11.11.11 -port 5555 -bind_addr YOUR_BIND_ADDRESSEAP_HOME/bin/client/jboss-client.jar org.jgroups.tests.McastReceiverTest -mcast_addr 230.11.11.11 -port 5555 -bind_addr YOUR_BIND_ADDRESS

Copy to Clipboard

Toggle word wrap

Then in another window start McastSenderTest:

java -cp EAP_HOME/bin/client/jboss-client.jar org.jgroups.tests.McastSenderTest -mcast_addr 230.11.11.11 -port 5555 -bind_addr YOUR_BIND_ADDRESS

java -cp EAP_HOME/bin/client/jboss-client.jar org.jgroups.tests.McastSenderTest -mcast_addr 230.11.11.11 -port 5555 -bind_addr YOUR_BIND_ADDRESSEAP_HOME/bin/client/jboss-client.jar org.jgroups.tests.McastSenderTest -mcast_addr 230.11.11.11 -port 5555 -bind_addr YOUR_BIND_ADDRESSEAP_HOME/bin/client/jboss-client.jar org.jgroups.tests.McastSenderTest -mcast_addr 230.11.11.11 -port 5555 -bind_addr YOUR_BIND_ADDRESS

Copy to Clipboard

Toggle word wrap

If you want to bind to a specific network interface card (NIC), use -bind_addr 192.168.0.2, where 192.168.0.2 is the IP address of the NIC to which you want to bind. Use this parameter in both the sender and the receiver.

You should be able to type in the McastSenderTest window and see the output in the McastReceiverTest window. If not, try to use -ttl 32 in the sender. If this still fails, consult a system administrator to help you setup IP multicast correctly, and ask the admin to make sure that multicast will work on the interface you have chosen or, if the machines have multiple interfaces, ask to be told the correct interface. Once you know multicast is working properly on each machine in your cluster, you can repeat the above test to test the network, putting the sender on one machine and the receiver on another.

Report a bug

13.8.2. Causes of missing heartbeats in FD
复制链接

Sometimes a member is suspected by FD because a heartbeat ack has not been received for some time T (defined by timeout and max_tries). This can have multiple reasons, e.g. in a cluster of A,B,C,D; C can be suspected if (note that A pings B, B pings C, C pings D and D pings A):

B or C are running at 100% CPU for more than T seconds. So even if C sends a heartbeat ack to B, B may not be able to process it because it is at 100%
B or C are garbage collecting, same as above.
A combination of the 2 cases above
The network loses packets. This usually happens when there is a lot of traffic on the network, and the switch starts dropping packets (usually broadcasts first, then IP multicasts, TCP packets last).
B or C are processing a callback. Let us say C received a remote method call over its channel and takes T+1 seconds to process it. During this time, C will not process any other messages, including heartbeats, and therefore B will not receive the heartbeat ack and will suspect C.

Report a bug

返回顶部

此内容没有您所选择的语言版本。

13.8. JGroups Troubleshooting

13.8.1. Nodes do not form a cluster
复制链接

13.8.2. Causes of missing heartbeats in FD
复制链接

学习

尝试、购买和销售

社区

关于红帽文档

让开源更具包容性

關於紅帽

Theme

Red Hat legal and privacy links

Red Hat legal and privacy links

此内容没有您所选择的语言版本。

13.8. JGroups Troubleshooting

13.8.1. Nodes do not form a cluster复制链接链接已复制到粘贴板!

13.8.2. Causes of missing heartbeats in FD复制链接链接已复制到粘贴板!

学习

尝试、购买和销售

社区

关于红帽文档

让开源更具包容性

關於紅帽

Theme

Red Hat legal and privacy links

Red Hat legal and privacy links

13.8.1. Nodes do not form a cluster
复制链接

13.8.2. Causes of missing heartbeats in FD
复制链接