In a previous blog post I described the steps to troubleshoot a Percona database cluster and restore it to a working state. This article adds some more information.


Finding the most advanced node

You bootstrap a failed cluster by starting the most advanced node first, i.e. the node with the most recent copy of the database. To find it, look at the file /var/lib/mysql/grastate.dat on every node:

# GALERA saved state
version: 2.1
uuid: 5ee99582-af8d-11e2-b8e3-23de292c1d30
seqno: 8204504245173

The most advanced node is the one with the highest seqno (sequence number). You can therefore use that node to bootstrap the cluster after deleting the MySQL data directory on the other nodes, as explained previously.

Note that seqno = -1 is normal in a running, healthy cluster, as this value is updated only when the MySQL daemon stops. However, a stopped cluster with seqno = -1 on all nodes means that the cluster crashed badly; in that case, the best thing to do is to bootstrap from any node and then restore from the latest backup.
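The comparison described above can be sketched in a few lines of Python. This is only an illustration, not an official Percona tool: it assumes you have already copied the contents of each node's grastate.dat into a dictionary keyed by node name (the node names and the function names `parse_seqno` and `most_advanced_node` are made up for this example), and that all nodes report the same cluster uuid, since sequence numbers are only meaningful within one cluster history.

```python
from typing import Dict, Optional


def parse_seqno(grastate_text: str) -> int:
    """Return the seqno recorded in the text of a grastate.dat file."""
    for line in grastate_text.splitlines():
        key, sep, value = line.partition(":")
        # Comment lines such as "# GALERA saved state" have no colon and are skipped.
        if sep and key.strip() == "seqno":
            return int(value.strip())
    raise ValueError("no seqno line found")


def most_advanced_node(states: Dict[str, str]) -> Optional[str]:
    """Return the node with the highest seqno.

    Returns None when there are no nodes or when every node reports -1,
    which is the "crashed badly" situation: no node is known to be ahead.
    """
    seqnos = {node: parse_seqno(text) for node, text in states.items()}
    if not seqnos:
        return None
    best = max(seqnos, key=seqnos.get)
    return best if seqnos[best] >= 0 else None
```

If the function returns a node name, that is the node to bootstrap from; if it returns None, you are in the case where any node can be bootstrapped and a restore from backup is needed.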


Resetting the quorum

If the command

# service mysql bootstrap-pxc

fails, it may be because the node being bootstrapped is no longer part of the Primary component. When a cluster splits, the Primary component is the group of nodes that holds the quorum and is therefore authorized to modify the database. If this is the problem, start the node normally with the command

# service mysql start

Then force the node to become Primary by running this MySQL command:

mysql> SET GLOBAL wsrep_provider_options='pc.bootstrap=true';

EDIT (21/04/2016): On RHEL 7, the command to use to bootstrap a PXC cluster is

# systemctl start mysql@bootstrap.service

It might also happen that one node is running correctly but all ClusterControl recovery jobs fail, because ClusterControl refuses to synchronize the other nodes with the first. Again, in this case it’s a quorum problem. Reset the quorum by forcing the first node as Primary as shown above.

Under normal conditions, every node of a cluster must be Primary. This is easy to check with the MySQL command

mysql> SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status';
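When there are several nodes, the check can be automated. The Python sketch below is an illustration under assumptions: it supposes you have captured, for each node, the output of the query above as produced by the mysql client in batch mode (tab-separated columns), and the names `parse_cluster_status` and `non_primary_nodes` are invented for this example. It does not connect to the database itself.

```python
from typing import Dict, List


def parse_cluster_status(mysql_output: str) -> str:
    """Extract the wsrep_cluster_status value from tab-separated mysql output."""
    for line in mysql_output.splitlines():
        fields = line.split("\t")
        # A header row such as "Variable_name<TAB>Value" does not match and is skipped.
        if len(fields) == 2 and fields[0].strip() == "wsrep_cluster_status":
            return fields[1].strip()
    raise ValueError("wsrep_cluster_status not found")


def non_primary_nodes(outputs: Dict[str, str]) -> List[str]:
    """Return the sorted names of nodes whose status is anything other than Primary."""
    return sorted(
        node
        for node, out in outputs.items()
        if parse_cluster_status(out) != "Primary"
    )
```

An empty list means all nodes report Primary; any node that appears in the list is a candidate for the quorum reset described above.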