Writing of this document is in process, please be patient (unless you find an mistake, in that case please step forward!).
Database Replication with MySQL
Setting up replication for an MySQL server might help in several situations:
- ease the load of the primary server when some application just issues select queries
- get rid of the extra load when you need to regularly make backups of your database (mysqldump)
- synchronize configuration data (i.e. master holds SMTP config data, replicate the SMTP tables and let the SMTP server make their configuration from the database)
- have an hot standby server which is able to replace an broken master
How does MySQL replication work at all?
Once the server is configured to write binlogs (log-bin, log-bin-index, one or several binlog-ignore-db statements) it starts writing all updates into its binary log file. You will have to explicitely ignore databases which you do not want to replicate at all, i.e. the mysql database containing users and permissions. Though it might be a good idea to sync this database, too, please note that permissions on IP base might not work for an slave which is located in a different IP network. I usually find it a good idea to have the mysql databases not replicated. You might want to add users and permissions to the slave manually, if you need hot standby functionality to replace an broken master server.
MySQL will log anything except for the ignored databases. It will keep track of the already existing binary logs by keeping its binary index up to date.
MySQL itself will not erase any of its written binary logs, unless you tell it to do so (see housekeeping).
When one or many MySQL clients start to connect to the master, they have to start somewhere. When you start off with an empty database, the situation is quite clear & easy. Set the slave to the current masters binlog file and position and start the slave.
MySQL's replication mechanism is based on a number of statements which get "replayed" to the slave database. When both databases are empty, you can just start applying the diff to the slave to keep it in sync. Once there is data on the master (generally spoken: master & slave databases differ), its not this easy any more. You can still just apply the diffs to the slave and in most cases it will take a quite long time until it will break, but note that the master and slave databases will never be identical, even when the diffs are applied without flaw (i.e. just inserting into the db).
This means we need to "seed" the slave database. There are several ways to do this. Have a look into the Setting up the Slaves section.
You can have as many slaves as you want, the slave keeps track of how much of the masters binlog it already polled. The master server just checks if an slave has permission to get data from the binlog and then just delivers an requested entry from an requested binlog file. It would be a good idea to create one MySQL user for each slave server, though it might not be neccessary.
Prerequisites
All servers in an MySQL master / slave setup need their own unique server id. Make sure you keep track of the id's that are already in use. You might make some notes in the master's my.cnf file, near the server-id statement which defines the id of the master server.
Sample:
- server-id=1
- # hot standby replication server (FQDN) has id 2
- # config replicator (FQDN) has id 10
Keep a backup or written note of the id's. Its not critical because you just need to make sure id's are unique, but it might save you time to reconfigure.
An MySQL slave will connect to the master server through the standard 3306 port using tcp. Make sure that routes are set and firewall is tuned.
Make yourself an map of the required slave servers. Note which databases they need to replicate (in some cases only a single db is required, sometimes everything). Define user names and passwords for each slave and note them for later use (I'd suggest to note them near the server id).
Look at the slave's requirements, then decide which databases should be written to the bin log by the master. Only log the minimum thats required, it will save performance and disk space.
MySQL with SSL capabilities
If the slave server is located far away from the master, you might think about enabling ssl. Otherwise all replication data will be transferred plain text (as any other mysql data transfer between an server and client will, too). Please note that there is no Debian package with enabled ssl currently. You might have to build your own MySQL or use other means to protect your data, i.e. setting up an VPN (OpenVPN).
Setting up the Server
In /etc/mysql/my.cnf
- server-id = 1
- log-bin
- log-bin-index
- expire_logs_days = 10
- max_binlog_size = 100M
- binlog_do_db = include_database_name(comma seperated list of names)
- binlog_ignore_db = include_database_name(comma seperated list of names)
Log into MySQL console and create users for each slave (use the previously prepared credentials):
GRANT REPLICATION SLAVE ON *.* TO '<user>'@'<host/ip>' IDENTIFIED BY '<slavepass>';
If you intend to use the load data from master statements, you will have to add more permissions to the user:
GRANT SELECT, RELOAD, FILE, SUPER, REPLICATION SLAVE ON *.* TO '<user>'@'<host/ip>' IDENTIFIED BY '<slavepass>';
Note: it might be that you don't need SUPER and SELECT for the latter grant, needs to be tested.
The server should now start to write all updates into the binary log, located in the /var/lib/mysql folder, unless you told it to put them elsewhere (append path to log-bin and log-bin-index).
Check the master status with the following command:
- show master status;
Housekeeping
In an normal state all slaves should be up to date with the master, this can be checked by issuing
- show slave status;
on the slave server(s). Look at the binlog file name and position and compare to the masters binlog file and position. One of the interesting variables gives us the seconds behind master, which is an indicator how up to date our slave is. On low/medium volume servers you should expect the counter to be always 0 or at least below 10. You will sometimes note that the slave is a bit off, this occurs when you issue heavy (means long running) update queries on the master. The slave runs the updates serially, which can cause other updates to be delayed.
The server keeps its binlogs until you tell it to prune them. Knowing that slaves normally are only seconds behind the master (or dead and thus need to be resynced anyway) you do not need days or weeks of old bin logs.
Setting expire_log_days properly should help, you can expire old bin log files manually by using the following sql query:
- PURGE BINARY LOGS TO 'mysql-bin.010';
- PURGE BINARY LOGS BEFORE '2008-04-02 22:46:26';
Setting up the Slaves
- set server id
- seed replication with db dump / load data from master
- clients get the whole binlog and filters out what should be replicated to THIS specific client
- starting from scratch is easy (no data that needs to be transferred to start up)
- starting in the middle can be easy (myisam load data from master)
- or a pain (innodb or mixed innodb and myisam tables)
Monitoring
- manually show slave status; show master status;
- Nagios
- ...
Crash Recovery
get an broken replication back to work
reasons of failure
- cross db inserts
- strange truncate conditions
replace an broken master with an up to date slave
command reference
- show slave status
- show master status
- stop slave
- reset slave
- start slave
- ...
Misc
- stored procedures
- views
- ...
Comments please
If you think that something is missing or simply wrong, please drop me an note or fix it by yourself.