Skip to content

AlertmanagerClusterDown

Meaning

Half or more of the Alertmanager instances within the same cluster are down.

Impact

You have an unstable cluster, if everything goes wrong you will lose the whole cluster.

Diagnosis

Verify why pods are not running. You can get a big picture with events.

$ kubectl get events --field-selector involvedObject.kind=Pod | grep alertmanager

Mitigation

There are no cheap options to mitigate this risk. Verifying any new changes in preprod before production environment should improve stability.