MySQL wsrep issue

Asked by Philip Wege

HA setup with 3 controllers. The devices went down due to a power outage; once the devices were back up, controller 1 is refusing to join the MySQL cluster again. Please see the error below:

Error on node-1 (controller 1):

<27>Apr 10 13:22:21 node-1 mysqld: 140410 13:22:21 [Note] WSREP: wsrep_start_position var submitted: '3cec74d0-aad8-11e3-0800-86e885e84480:11010669'
<27>Apr 10 13:22:21 node-1 mysqld: 140410 13:22:21 [Note] WSREP: SST received: 3cec74d0-aad8-11e3-0800-86e885e84480:11010669
<27>Apr 10 13:22:21 node-1 mysqld: 140410 13:22:21 [Note] WSREP: Receiving IST: 40494 writesets, seqnos 11010669-11051163
<27>Apr 10 13:22:21 node-1 mysqld: 140410 13:22:21 [ERROR] Slave SQL: Could not execute Update_rows event on table keystone.token; Can't find record in 'token', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 8290, Error_code: 1032
<27>Apr 10 13:22:21 node-1 mysqld: 140410 13:22:21 [Warning] WSREP: RBR event 2 Update_rows apply warning: 120, 11010671
<27>Apr 10 13:22:21 node-1 mysqld: 140410 13:22:21 [ERROR] WSREP: receiving IST failed, node restart required: Failed to apply app buffer: UxFS, seqno: 11010671, status: WSREP_FATAL
<27>Apr 10 13:22:21 node-1 mysqld: at galera/src/replicator_smm.cpp:apply_wscoll():49
<27>Apr 10 13:22:21 node-1 mysqld: at galera/src/replicator_smm.cpp:apply_trx_ws():120
<27>Apr 10 13:22:21 node-1 mysqld: 140410 13:22:21 [Note] WSREP: Closing send monitor...
<27>Apr 10 13:22:21 node-1 mysqld: 140410 13:22:21 [Note] WSREP: Closed send monitor.

And on node-2 (controller 2, which is the donor) the following error is shown:

<27>Apr 10 13:22:21 node-2 mysqld: 140410 13:22:21 [ERROR] WSREP: async IST sender failed to serve tcp://10.100.252.3:4568: ist send failed: 1', asio error 'Connection reset by peer': 104 (Connection reset by peer)
<27>Apr 10 13:22:21 node-2 mysqld: at galera/src/ist.cpp:send():744

Has anyone seen this before?

Question information

Language: English
Status: Solved
For: Fuel for OpenStack
Assignee: No assignee
Solved by: Philip Wege
Miroslav Anashkin (manashkin) said :
#1

Greetings Philip,

What is your Fuel/Mirantis OpenStack version?

Kind regards,
Miroslav

Philip Wege (s-philip) said :
#2

Hi Miroslav

It is version 4.1

Miroslav Anashkin (manashkin) said :
#3

Greetings Philip,

Please confirm: you simply started up all 3 controllers at the same time and Pacemaker failed to reassemble the Galera cluster.

Also, please try rebooting all 3 controllers at the same time and wait 5-10 minutes after all 3 have started.
If Pacemaker fails to assemble Galera automatically, then we will proceed with a manual OpenStack cluster startup.

Kind Regards,
Miroslav

Philip Wege (s-philip) said :
#4

Hi Miroslav

Yes, they booted up at the same time. I have rebooted them all at the same time now and the situation is reversed: node-1's MySQL started and is in the cluster, but node-2 and node-3 are currently missing.

`crm status` says node-3's MySQL is up, but it is not showing in `mysql -e "show status like 'wsrep%';"`,
and `crm status` shows this for node-2: p_mysql (ocf::mirantis:mysql-wss): Started node-2 FAILED

Philip Wege (s-philip) said :
#5

Node-3 has the following error in the mysqld log:

<27>Apr 10 15:11:46 node-3 mysqld: 140410 15:11:46 [ERROR] WSREP: Local state seqno (11078833) is greater than group seqno (11010801): states diverged. Aborting to avoid potential data loss. Remove '/var/lib/mysql//grastate.dat' file and restart if you wish to continue. (FATAL)

Miroslav Anashkin (manashkin) said :
#6

Hi Philip,

Then you have to check the following first:

1. Please run `mysql -e "show status like 'wsrep%';"` on each controller and store its output so you have it at hand (see the example after this list).

2. Check the contents of /var/lib/mysql/grastate.dat on each controller, including the one with an operable Galera.
This link describes what you may encounter in this file:
http://galeracluster.com/documentation-webpages/restartingacluster.html

You have to find the grastate.dat file with the greatest UUID value among the controllers, and with a correct seqno value at the same time.
You should also note the nodes with seqno: -1; these nodes probably either have uncommitted transactions or simply crashed. It is not recommended to start a new Galera cluster from such nodes. Please mark such grastate.dat files as incorrect.

3. Please compare the UUID values from the correct grastate.dat files with the ones reported by the `mysql -e "show status like 'wsrep%';"` command; these values should match. We need to be sure your started node started from some existing transaction and not from an empty database.
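For example, the information from steps 1 and 2 can be gathered on each controller with something like the following (the output file name is just an example):

# capture the Galera status reported by the running MySQL instance
mysql -e "show status like 'wsrep%';" > /root/wsrep_status_$(hostname).txt
# and the on-disk state Galera would start from
cat /var/lib/mysql/grastate.dat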

Our goal is to find the node to start Galera cluster with.

As an alternative, you may share the output from `mysql -e "show status like 'wsrep%';"` and the contents of all the grastate.dat files at http://paste.openstack.org/ and I'll check them myself.

Kind regards,
Miroslav

Miroslav Anashkin (manashkin) said :
#7

Correction:

If the UUID from grastate.dat is greater than the one from `mysql -e "show status like 'wsrep%';"`, it may indicate a new cluster started from an empty database.
These values should match.

Philip Wege (s-philip) said :
#9

Node 1 seqno: -1, so it crashed
Node 2 seqno: -1, so it crashed
Node 3 seqno: 11078833

Miroslav Anashkin (manashkin) said :
#10

Hi Philip,

It looks like your node-3 has successfully committed the most recent transaction,
and your power loss happened before the remaining 2 nodes synchronized with node-3.
So it should be safe to start up the cluster from this node (node-3).

Please do the following:

Go to node-1.
Move /var/lib/mysql/grastate.dat somewhere else.
Run the `service mysql stop` command.
Wait 1-2 minutes and check whether MySQL on this node has been automatically restarted by Pacemaker and the node has joined the existing cluster.
If the node entered the cluster successfully, repeat the same procedure on node-2 (a command sketch follows below).
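A minimal sketch of those steps on node-1 (the backup path is only an example, and the service name may differ, as discussed below):

# park the old state file so Galera performs a full state transfer on restart
mv /var/lib/mysql/grastate.dat /root/grastate.dat.node-1.bak
# stop MySQL; Pacemaker should bring it back within a couple of minutes
service mysql stop
# after 1-2 minutes, confirm the node rejoined the cluster
crm status
mysql -e "show status like 'wsrep_cluster_size';"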

After you get the Galera cluster assembled, you may proceed with restoring the RabbitMQ and Neutron services.

Kind regards,
Miroslav

Philip Wege (s-philip) said :
#11

Thanks Miroslav

But I get this error when running `service mysql stop`:
mysql: unrecognized service

Philip Wege (s-philip) said :
#12

Would it be advisable to kill the services manually?

Miroslav Anashkin (manashkin) said :
#13

Then you may run
`/etc/init.d/mysql stop`

Or kill it manually.
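If you do kill it, a rough approach (process names are assumptions; the exact processes under the Pacemaker resource may differ) is:

# stop the wrapper first so it does not respawn mysqld
pkill mysqld_safe
pkill mysqld
# then let Pacemaker restart the p_mysql resource and verify
crm status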

Philip Wege (s-philip) said :
#14

Node-1 and node-2 are now in a cluster together, but they created a new UUID between the two of them; node-3 is unchanged with the old UUID.

Miroslav Anashkin (manashkin) said :
#15

Oh, a classic split-brain.
Looks like a bug in the Pacemaker scripts.

So, `mysql -e "show status like 'wsrep%';"` shows a cluster_size of 2 for node-1 and node-2, and 1 for node-3?

Philip Wege (s-philip) said :
#16

Yup that is correct.

Philip Wege (s-philip) said :
#17

Thanks Miroslav for your help; all is sorted, all three are back in the cluster.

Miroslav Anashkin (manashkin) said :
#18

Greetings Philip,

Sorry for the delay with the answer; it was already late night here in Moscow and I had no working HA environment at hand.

It would be helpful if you could describe the steps you took to sort out this issue, so others might read this answer and find the solution.

Additionally, after you get the Galera and RabbitMQ clusters operable, you may have to restart the Neutron services, since these depend on MySQL and RabbitMQ.

You should run the following set of commands on any single controller only, since they affect the whole cluster.
Please verify the exact resource names with `crm status` or `crm resource list`; the names differ between CentOS and Ubuntu.
The correct restart command sequence is as follows (based on Ubuntu resource names):

`crm resource cleanup clone_p_neutron-metadata-agent`
`crm resource cleanup clone_p_neutron-plugin-openvswitch-agent`
`crm resource restart clone_p_neutron-plugin-openvswitch-agent`
`crm resource restart clone_p_neutron-metadata-agent`

Normally, you should not restart p_neutron-dhcp-agent and p_neutron-l3-agent; both should be restarted automatically with the required parameter set as part of the clone_p_neutron-plugin-openvswitch-agent start.

And after getting Neutron back, you have to restart all network-dependent Nova services, or simply all Nova services.
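A rough sketch of that Nova restart on a controller, assuming Ubuntu-style service names (the actual list depends on which services are installed on the node):

# restart the controller-side Nova services; adjust the list to match your node
for s in nova-api nova-scheduler nova-conductor nova-cert nova-consoleauth; do
    service "$s" restart
done
# in VlanManager setups, also restart nova-network on the nodes where it runs
service nova-network restart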

Kind regards,
Miroslav

Philip Wege (s-philip) said :
#19

Hi Miroslav

After the split-brain happened, I confirmed that when I create a new volume or instance it is created in the database on node-1 and node-2. I then removed node-3's grastate.dat and node-3 joined node-1 and node-2 in the cluster.

All three grastate.dat files have the same UUID and seqno now, but the seqno on all three is -1.

My setup doesn't use Neutron; it uses VlanManager. All Nova and Cinder services were restarted.
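For reference, a quick way to verify that all three controllers are back in one cluster (a seqno of -1 in grastate.dat is normal while mysqld is running; a real seqno is only written on a clean shutdown):

# run on each controller
mysql -e "show status like 'wsrep_cluster_size';"        # should report 3 everywhere
mysql -e "show status like 'wsrep_cluster_state_uuid';"  # should be identical on all nodes
cat /var/lib/mysql/grastate.dat                          # the uuid should match the cluster state UUID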