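12.4. Verification
To test that the Prometheus alert triggers the webhook as expected, perform the following steps to simulate a split-brain:
In each cluster, run the following:
Command: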
oc -n openshift-operators scale --replicas=0 deployment/infinispan-operator-controller-manager
oc -n openshift-operators rollout status -w deployment/infinispan-operator-controller-manager
oc -n ${NAMESPACE} scale --replicas=0 deployment/infinispan-router
oc -n ${NAMESPACE} rollout status -w deployment/infinispan-router
- Verify that the SiteOffline event has fired in a cluster by inspecting the Observe → Alerting menu in the OpenShift console.
- Inspect the Global Accelerator EndpointGroup in the AWS console and confirm that only a single endpoint is present.
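The endpoint count can also be checked from a terminal instead of the AWS console. The following is a minimal sketch, assuming the AWS CLI is installed and configured with credentials that may read the accelerator. The function names `endpoint_count` and `expect_endpoints` and the ARN argument are illustrative and do not come from this guide. Global Accelerator API calls must be sent to the us-west-2 region, which is why it is hard-coded.

```shell
# Hypothetical helper: print the number of endpoints in an endpoint group.
# $1 is the ARN of your Global Accelerator endpoint group (placeholder).
endpoint_count() {
  aws globalaccelerator describe-endpoint-group \
    --region us-west-2 \
    --endpoint-group-arn "$1" \
    --query 'length(EndpointGroup.EndpointDescriptions)' \
    --output text
}

# Succeeds only when the group holds exactly the expected number of endpoints.
# $1 = endpoint group ARN, $2 = expected count.
expect_endpoints() {
  local arn="$1" expected="$2" actual
  actual="$(endpoint_count "$arn")" || return 2
  echo "endpoints: ${actual} (expected ${expected})"
  [ "$actual" = "$expected" ]
}
```

While the split-brain is simulated, `expect_endpoints "<your-endpoint-group-arn>" 1` should succeed. After both sites are back in the EndpointGroup, the expected count is 2.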
Scale up the Data Grid Operator and the Gossip Router to re-establish the connection between sites:
Command:
oc -n openshift-operators scale --replicas=1 deployment/infinispan-operator-controller-manager
oc -n openshift-operators rollout status -w deployment/infinispan-operator-controller-manager
oc -n ${NAMESPACE} scale --replicas=1 deployment/infinispan-router
oc -n ${NAMESPACE} rollout status -w deployment/infinispan-router
Replace ${NAMESPACE} with the namespace containing your Data Grid server.
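The scale-down and scale-up commands in this procedure differ only in the replica count, so they can be wrapped in one function. This is a sketch rather than part of the documented procedure: `scale_xsite` is a hypothetical name, and the function assumes that `oc` is already logged in to the target cluster and that `NAMESPACE` is set in the environment.

```shell
# Hypothetical wrapper around the four oc commands of this procedure.
# Usage: scale_xsite <replicas>   (0 simulates the split-brain, 1 restores it)
scale_xsite() {
  local replicas="$1"
  if [ -z "${NAMESPACE:-}" ]; then
    echo "NAMESPACE is not set" >&2
    return 1
  fi
  # Same order as the documented commands: operator first, then the router.
  # Each step runs only if the previous one succeeded.
  oc -n openshift-operators scale --replicas="$replicas" deployment/infinispan-operator-controller-manager &&
  oc -n openshift-operators rollout status -w deployment/infinispan-operator-controller-manager &&
  oc -n "$NAMESPACE" scale --replicas="$replicas" deployment/infinispan-router &&
  oc -n "$NAMESPACE" rollout status -w deployment/infinispan-router
}
```

Run `scale_xsite 0` in each cluster to start the test and `scale_xsite 1` in each cluster to end it.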
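- Inspect the vendor_jgroups_site_view_status metric in each site. A value of 1 indicates that the site is reachable.
- Update the Accelerator EndpointGroup so that it contains both endpoints. For details, see the Bring site online chapter.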
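The metric check in the steps above can be scripted against the Prometheus HTTP query API. The sketch below rests on several assumptions that are not part of this guide: `PROM_URL` points at a Prometheus-compatible query endpoint that scrapes the Data Grid metrics, `TOKEN` is a bearer token allowed to query it, and the function names are illustrative. It asks for every series whose value is not 1 and treats an empty result as "all sites reachable".

```shell
# Assumed environment: PROM_URL (query endpoint base URL) and TOKEN (bearer token).
# Returns the raw JSON for all series whose site view status is not 1.
unreachable_sites() {
  curl -sS -G "${PROM_URL}/api/v1/query" \
    -H "Authorization: Bearer ${TOKEN}" \
    --data-urlencode 'query=vendor_jgroups_site_view_status != 1'
}

# Succeeds when the query succeeded and returned an empty result vector.
all_sites_reachable() {
  local response
  response="$(unreachable_sites)" || return 2
  printf '%s' "$response" | grep -q '"status":"success"' || return 2
  # An empty vector means no series reported a value other than 1.
  printf '%s' "$response" | grep -q '"result":\[\]'
}
```

An empty result is also returned when the metric is not scraped at all, so confirm once that `vendor_jgroups_site_view_status` exists in your monitoring stack before relying on this check.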