I've had a dozen people tell me that setting up a redundant MySQL server was
just too difficult to be worth the effort. After all, what was the likelihood that the database server would ever go down? I mean... its not like tires on cars ever going flat, why would a computer ever break? Besides, the server is on a UPS.
Well... Bad news DBA's: I got it working in under an hour (not counting deploying the VMs and config'ing the base environment.) Its remarkably straight forward: the high altitude view is defined in
RTFM 5.0 16.1.1.7, but of course, there are a few tricks.
Assuming a new cluster build, one will be a master, a second a slave. Start
mysqld and verify functionality on both servers. On master, add to
/etc/my.cnf:
log-bin=mysql-bin
server-id=2024561111 # some random value
On slave, add to
/etc/my.cnf:
server-id=7035713343 # some random value
report-host=slaveXYZ # bonus: name of slave
Restart both servers. Verify
Log into master's MySQL console and add the replication user:
mysql> CREATE USER 'replicant' IDENTIFIED BY 'password';
mysql> GRANT REPLICATION SLAVE ON *.* TO 'replicant';
Here's another trick: The replication is going to execute across the network in the clear, so ultimately, we want to push this through an SSL tunnel. With
stunnel, we can specify the user as 'replicant'@'127.0.0.1' (not
localhost).
Determine the master's log coordinates: (yes, that's what its called)
FLUSH TABLES WITH READ LOCK;
SHOW MASTER STATUS;
UNLOCK TABLES;
We need to document the
File and
Position. Technically, the LOCK and UNLOCK are not needed since this is a new cluster. The right way to do this is to delay the UNLOCK until you verify slave is connected.
Log into slave's MySQL console and define the master:
mysql> CHANGE MASTER TO
-> MASTER_HOST='127.0.0.1',
-> MASTER_PORT=1234,
-> MASTER_USER='replicant',
-> MASTER_PASSWORD='password',
-> MASTER_LOG_FILE='mysql-bin.000001',
-> MASTER_LOG_POS=304;
Another trick here: since we are going to use an SSL tunnel, we need a custom port value. Again, use the IP address, not localhost, to force it the TCP stack. Notice we used the file and position documented above.
The big gap in the documentation is its failure to tell you to actually start replication. On the slave:
mysql> SHOW SLAVE STATUS\G
<snip>
Slave_IO_Running: No
Slave_SQL_Running: No
<snip>
mysql> SLAVE START;
mysql> SHOW SLAVE STATUS\G
Did it work? Check the master:
mysql> SHOW SLAVE HOSTS;
Under
Host you should see the "bonus" name specified in the slave's
/etc/my.cnf.
And that's it. A big misconception seems to be that slave pulls from master, which is accurate, but not true. When master executes a write, it "pings" slave. If slave is up, it executes the pull. This could pose a problem if slave is DHCP. If slave is offline, it resyncs on start, which creates load on master. Depending on master's load, our systems could experience "lag".
Oh, and yes, I'm referencing 5.0, because that's what's included with RHEL5. My old copy of RHEL 6 Beta had 5.1 (I really should get a newer copy.) I glanced over the MySQL 5.5 docs, no big differences.