Munin is a powerful system monitoring tool.  What makes it so versatile is that you can write your own plugins to monitor any parameter you want.

In this post I’ll show how you can monitor the status of MySQL replication. We’ll be monitoring the lag between master and slave, and whether the slave is running. These plugins are based on examples published on the Munin website.

I’ll assume that you already have a Munin server up and running, and that the machine acting as slave is already set up as a Munin node.
To monitor the replication we’ll log in to MySQL as slave_user with password s3cr3t; note that slave_user needs the SUPER or REPLICATION CLIENT privilege in order to run SHOW SLAVE STATUS. The slave machine is slave.yourdomain.org, with IP address 10.7.7.7.

Log in to the slave and create the file /usr/share/munin/plugins/mysql_replag :

#!/bin/sh 
# Plugin to monitor the Seconds_Behind_Master of replication on a MySQL slave
  
MYSQLOPTS=$mysqlopts
MYSQL=${mysql:-mysql}   
if [ "$1" = "autoconf" ]; then 
	$MYSQL --version 2>/dev/null >/dev/null 
	if [ $? -eq 0 ] 
	then 
		$MYSQL $MYSQLOPTS -e '' 2>/dev/null >/dev/null 
		if [ $? -eq 0 ] 
		then 
			echo yes 
			exit 0 
		else 
			echo "no (could not connect to mysql)" 
		fi 
	else 
		echo "no (mysql not found)" 
	fi 
	exit 1 
fi   

if [ "$1" = "config" ]; then 
	echo 'graph_title Replication lag' 
	echo 'graph_args --base 1000 -l 0' 
	echo 'graph_vlabel lag in secs' 
	echo 'graph_category mysql' 
	echo 'lag.label lag' 
	exit 0 
fi   

/usr/bin/printf "lag.value " 
# Use the configured client ($MYSQL), as the autoconf check above does
$MYSQL $MYSQLOPTS -e 'show slave status\G' | grep Seconds_Behind_Master | awk '{print $2}'

This plugin will produce a Munin graph showing the lag (in seconds) between master and slave.
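
One caveat worth handling: when replication is stopped or broken, MySQL reports Seconds_Behind_Master as NULL, which is not a number Munin can graph. Below is a minimal sketch of a more defensive extraction. Note that sample_status is a hypothetical stand-in for the real mysql output, and extract_lag is an illustrative helper name, not part of the plugin above. It prints U, Munin's marker for an unknown value, when the field is NULL or missing.

```shell
#!/bin/sh
# Hypothetical stand-in for: $MYSQL $MYSQLOPTS -e 'show slave status\G'
sample_status() {
	cat <<'EOF'
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
        Seconds_Behind_Master: 0
EOF
}

# Read status output on stdin and print the lag.
# Prints U (Munin's "unknown") when the field is NULL or absent.
extract_lag() {
	awk '$1 == "Seconds_Behind_Master:" { v = $2 }
	     END { if (v == "" || v == "NULL") print "U"; else print v }'
}

echo "lag.value $(sample_status | extract_lag)"
```

In the real plugin you would pipe the mysql command into extract_lag instead of sample_status.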

Now create another file, /usr/share/munin/plugins/mysql_repstatus :

#!/bin/sh 
# Plugin to monitor the replication status on a MySQL slave

MYSQLOPTS=$mysqlopts
MYSQL=${mysql:-mysql}   
if [ "$1" = "autoconf" ]; then 
	$MYSQL --version 2>/dev/null >/dev/null 
	if [ $? -eq 0 ] 
	then 
		$MYSQL $MYSQLOPTS -e '' 2>/dev/null >/dev/null 
		if [ $? -eq 0 ] 
		then 
			echo yes 
			exit 0 
		else 
			echo "no (could not connect to mysql)" 
		fi 
	else 
		echo "no (mysql not found)" 
	fi 
	exit 1 
fi   

if [ "$1" = "config" ]; then 
	echo 'graph_title Replication status' 
	echo 'graph_args -l 0 --upper-limit 2' 
	echo 'graph_vlabel slave is running' 
	echo 'graph_category mysql' 
	echo 'status.label running: 1=yes, 0=no' 
	exit 0 
fi   

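# A pipeline's exit status is that of its last command (grep here),
# so this works in any POSIX shell without bash's PIPESTATUS.
if $MYSQL $MYSQLOPTS -e 'show slave status\G' | grep "Slave_IO_Running: Yes" >/dev/null
then
	echo "status.value 1"
else
	echo "status.value 0"
fi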

This plugin will produce a Munin graph with a value of 1 if the slave I/O thread (Slave_IO_Running) is running, and 0 otherwise.
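
The check above looks only at Slave_IO_Running. Replication can also stall when the SQL thread stops, so you may prefer to require both threads. Here is a sketch of that variant; fake_status and both_threads_running are hypothetical names used for illustration, with fake_status standing in for the real mysql output.

```shell
#!/bin/sh
# fake_status IO SQL: hypothetical stand-in for 'show slave status\G' output
fake_status() {
	printf '             Slave_IO_Running: %s\n' "$1"
	printf '            Slave_SQL_Running: %s\n' "$2"
}

# Read status output on stdin; print 1 only when both threads say Yes, else 0.
both_threads_running() {
	awk '$1 == "Slave_IO_Running:"  && $2 == "Yes" { io = 1 }
	     $1 == "Slave_SQL_Running:" && $2 == "Yes" { sql = 1 }
	     END { if (io && sql) print 1; else print 0 }'
}

echo "status.value $(fake_status Yes Yes | both_threads_running)"
echo "status.value $(fake_status Yes No | both_threads_running)"
```

The alert threshold shown later in this post (critical below 1) would work unchanged with this variant.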

Then create the two configuration files for these plugins, /etc/munin/plugin-conf.d/mysql_replag and /etc/munin/plugin-conf.d/mysql_repstatus :

[mysql_rep*]
user root
env.mysqlopts -uslave_user -ps3cr3t -h 10.7.7.7 

(The [mysql_rep*] section header matches both plugin names, so a single file is in fact sufficient; if you prefer two, create the first one and make the second a symlink to it.)
This keeps the configuration (the MySQL user, password, and host) separate from the plugin, which is always a good thing.
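
To see how this setting reaches the plugins: every env.NAME line in the plugin configuration becomes an environment variable NAME when munin-node runs the plugin. The sketch below simulates that by hand, setting mysqlopts directly (an assumption made purely for the demonstration), and then prints the command line the plugins would build.

```shell
#!/bin/sh
# Simulate what munin-node does for the line "env.mysqlopts ..."
mysqlopts="-uslave_user -ps3cr3t -h 10.7.7.7"
unset mysql                      # no env.mysql line in our configuration

# The same two assignments found at the top of both plugins
MYSQLOPTS=$mysqlopts
MYSQL=${mysql:-mysql}            # falls back to "mysql" looked up on PATH

# Show the resulting command line without executing it
printf '%s\n' "$MYSQL $MYSQLOPTS -e 'show slave status\\G'"
```

If your client lives elsewhere, an additional env.mysql line with its full path would override the default.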

To use these plugins, ensure they’re executable, and symlink them to the Munin plugin directory. Then restart the Munin node daemon:

# chmod 755 /usr/share/munin/plugins/mysql_rep*
# ln -s /usr/share/munin/plugins/mysql_replag /etc/munin/plugins/
# ln -s /usr/share/munin/plugins/mysql_repstatus /etc/munin/plugins/
# service munin-node restart

Now it’s a good idea to test whether these plugins are working correctly:

# munin-run mysql_replag
# munin-run mysql_repstatus

If both the plugins and the replication are working, you should get lag.value 0 and status.value 1 respectively.
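
A failed MySQL login tends to show up as an empty or non-numeric value rather than as an obvious error, so it can help to check the shape of the output. The following sketch uses a hypothetical helper, is_value_line, and hard-coded sample lines in place of real munin-run output.

```shell
#!/bin/sh
# is_value_line LINE: succeed when LINE looks like "<field>.value <number or U>"
is_value_line() {
	printf '%s\n' "$1" |
		grep -Eq '^[A-Za-z_][A-Za-z0-9_]*\.value (U|-?[0-9]+(\.[0-9]+)?)$'
}

# Sample lines: two healthy ones, then two typical failure shapes
for line in "lag.value 0" "status.value 1" "lag.value NULL" "lag.value "; do
	if is_value_line "$line"; then
		echo "ok:  $line"
	else
		echo "bad: $line"
	fi
done
```

On the node itself you could call it as is_value_line "$(munin-run mysql_replag)".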

Log in to the Munin master and edit the file /etc/munin/munin.conf to add some alerts. (As you’re already monitoring the slave server, you may already have some alarms configured in the relevant group of the host tree.)

[slave.yourdomain.org]
    address 10.7.7.7
    mysql_replag.lag.critical 10
    mysql_repstatus.status.critical 1:

This will issue a critical-level alert if the replication lag exceeds 10 seconds, or if the slave stops running (that is, if the status value drops below 1). Feel free to adjust these values as you wish.

You can also have Munin launch a warning for a lower replication lag, e.g.

    mysql_replag.lag.warning 5

however, note that this will draw a line in the graph at 5 seconds, due to a bug.
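
The thresholds use Munin's range notation min:max, where either side may be left out and a bare number is read as a maximum. As I understand that notation, 10 means "alert above 10" and 1: means "alert below 1". The helper below, in_range, is a hypothetical name used only to illustrate the logic; Munin performs this check itself.

```shell
#!/bin/sh
# in_range VALUE RANGE: succeed when VALUE lies inside RANGE ("min:max", "min:", ":max" or "max")
in_range() {
	value=$1
	case $2 in
		*:*) min=${2%%:*}; max=${2#*:} ;;   # split on the colon
		*)   min=; max=$2 ;;               # bare number: maximum only
	esac
	awk -v v="$value" -v lo="$min" -v hi="$max" 'BEGIN {
		if (lo != "" && v + 0 < lo + 0) exit 1
		if (hi != "" && v + 0 > hi + 0) exit 1
		exit 0
	}'
}

for lag in 3 12; do
	if in_range "$lag" 10; then echo "lag $lag: ok"; else echo "lag $lag: critical"; fi
done
for status in 1 0; do
	if in_range "$status" 1:; then echo "status $status: ok"; else echo "status $status: critical"; fi
done
```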

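The Munin master rereads munin.conf each time munin-cron runs, so the new thresholds are applied at the next update cycle without any restart. A restart of the munin-node service is needed only on a node whose plugins or plugin configuration have changed:

# service munin-node restart

Your Munin master should start to monitor the replication from its next run onwards.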

2 replies
    • Daniele Raffo says:

      Good point. This script was tested on a RHEL machine, where /bin/sh is a symlink to /bin/bash. People using Debian/Ubuntu will need to change #!/bin/sh to #!/bin/bash.
