For larger database setups, companies use replication to ensure that data is passed to another server. This server is usually a secondary physical machine that imports data from the main publisher. The configuration is a "master-slave" setup where the master database is the original storage machine and the slave is the recipient of the replicated data. Replication configurations assume that you have two different MySQL servers set up. This article explains the benefits of replication and how to set up your MySQL environment.
Why Use Replication?
You might first wonder if there are any benefits to creating complexity among your database servers. Replication requires another MySQL server, and it requires additional configurations. It generally takes more maintenance and man hours to configure and monitor the replication service. However, there are several benefits for the business and database administrators.
First, your applications no longer rely on one database server. If the master server fails, you can temporarily switch the connection to the replicated server to provide stability during a critical outage. This includes if the network fails or server hardware crashes the physical machine.
Second, performance actually increases even though complexity would make most administrators think that it would cause performance degradation. When you spread data across multiple servers, you can connect different applications to each server to improve performance. This is generally the way data centers work – they connect a user to the closest available server to reduce response times.
Most businesses use transactional database tables, which means the preferred storage engine is InnoDB. With replication, commits are first written to the network instead of to the hard drive like they do with one physical server. Writes are done synchronously for one physical server to the hard drive, so writing to the network greatly increases performance.
Replication is also a type of disaster recovery database backup that's more efficient than storing data to disks. With replication, you can restore your master server with replicated data instead of digging into backup files.
The basic configuration is master-slave where the master handles the write transactions and the slave server only reads the data into a mirrored database. You can also set up master-master solutions, but this is for more advanced enterprise platforms. With a master-master setup, you can create a load balanced environment where the servers share the load between multiple transactions. The MySQL servers have a load balancer between the application and the databases, and the load balancer sends requests to the database that can handle each transaction with the best performance.
Even with the fastest network, you have replication lag, and this should be considered when you set up your environment. Lag is not a large concern if you only use a replicated database for a backup or for services such as reporting. It's common to have a 24-hour lag between production data and reporting tools, so lag time isn't a main concern. However, if the replication server is a foundation for important transaction used for production, lag time should be closely monitored to avoid any data integrity issues.
With these advantages in mind, let's take a look at how you can set up replication on a MySQL server environment.
Setting Up the Master Database
This file is necessary for the master server in a replicated environment. You can also set the size of this log to determine how many days of data it stores. It's this file that feeds the slave server, so you should have a relatively large binary log file to ensure that you keep records available for replication.
Your master database writes each writable event to the binary log file. A writable event, if you recall, is any statement that changes data. This includes any UPDATE, INSERT, and DELETE statement. The SELECT statement doesn't change any data, so this event is not recorded and it's not needed by the replicated server.
When the slave server connects to its master replication service, it connects in the same way a client connects to the server. It requests a connection on the configured port, which is 3306 by default in MySQL. It's important to remember that any maximum connection threshold reached can cause issues with the replication service, so consider your replication service when you set up a maximum connection configuration.
When the slave connects, it reads the binary log file for a list of events. It's important to have a short delay in the amount of time that the slave server returns to read events. If you have the service set up to only read every other day, you run the risk of the master database no longer having this data cached in the log file. The result is a disk error and possible data loss.
The first step to set up a replication system is to give your server an ID. This is done using the configuration files we discussed in the previous chapter. You'll recall that we discussed the mysqld section. The following is an example of a my.cnf file used to configure the MySQL server:
[mysqld]
server-id=1
binlog-format = mixed
log-bin=mysql-bin
datadir=/var/lib/mysql
innodb_flush_log_at_trx_commit=1
sync_binlog=1
Notice that the first line defines the ID for the server. In this example, the MySQL server is given an ID of 1.
After you've created the necessary configuration file, you must restart the MySQL database server. In case you forgot how to restart your server, type the following command into your MySQL shell.
mysqld restart
With the master database server configured, the administrator needs to create a user that has replication access. We discussed how to create a MySQL user, but creating this user is somewhat different in syntax. The following command creates your replication user.
CREATE USER replication_user@<<slave-server-ip>>;
GRANT REPLICATION SLAVE ON *.* TO replication_user@<<slave-server-ip>>
IDENTIFIED BY '<<password>>';
The first MySQL statement creates the user for the slave database IP address. If you recall from previous chapters, creating the user does not automatically give it access to any databases. As a matter of fact, the user has access rights to nothing on the database server.
The next line of code grants the user access to the replication database. The granted right is replication slave on all databases on the MySQL server. In this example, the user has the right to ping the master server for the data needed for the replicated tables.
After you've set up your user, you need to create a backup. This backup will be transferred to the slave server. Since the slave server needs records from the master database, using this mysqldump output will reduce the load time of new data on the slave server. This is especially beneficial if you have an extremely large database that you want to replicate.
Remember, this command dumps SQL statements to a file and not the actual data. You use these statements to load up your new slave machine with the necessary data and table structure. It also adds primary and foreign key constraints. It takes a complete snapshot of your whole architecture, which you want to replicate on your slave server.
To create a new dump file, use the following code:
mysqldump --skip-lock-tables --single-transaction --flush-logs --hex-blob --master-data=2 -A > ~/dump.sql
Let's examine what the statement does and its options. The first option skips locking tables, so it won't affect your database performance if it's currently running in production. The "single transaction" option sends a START TRANSACTION command before it dumps the file contents. If you remember, nnoDB storage engines support transactional statements that help ensure data integrity.
The "flush logs" option flushes all logs before creating the dump, so you have no uncommitted or unfinished data transactions before dumping your information to the backup file.
The "hex blob" option uses hexadecimal notation when using blob fields. Blob fields are large storage fields for binary data such as images or files.
The final "master data" option writes the binary log file information to the dump, so it can be referenced later. In this example, we use a value of 2.
All information is exported to a dump.sql file that contains your database information. If the data is unusually large, then you might need to zip the log file using Gzip. In case you forget, the following command compresses the log file after exporting your database schema and data to the sql dump file.
mysqldump --skip-lock-tables --single-transaction --flush-logs --hex-blob --master-data=2 -A | gzip > ~/dump.sql
After you create the dump file, you need to review it for two pieces of information. You need these two pieces of information when you configure the slave server. Review the file with any text editor and jot down the values for the following two variables:
MASTER_LOG_FILE
MASTER_LOG_POS
That's all you need to do on the master server. You'll need to send the sql file dump to your slave server, because we'll need it for the slave configurations.
Before you move on to the slave server, you can take a look at the master server status to ensure that you don't have any errors. Run the following statement on your master database:
SHOW MASTER STATUS;
You will see output similar to the following:
File Position Binlog_Do_DB Binlog_Ignore_DB
----------------------------------------------------------------------------------------------------------
Mysql-bin.0000001 100 sampledatabase
The position value should be the same as the MASTER_LOG_POS variable you took note of earlier. This is the position where the replication server will start pulling data.
Setting Up a Slave Server
With your dump.sql file in hand, it's time to set up the slave server. The following commands and configurations should be done on the slave server. Ensure that you move to your slave server before using these setup steps.
Just like the master server, you also need to configure the slave server's my.cnf file with the right configurations for a slave configuration.
The following is an example of a mysqld section for a slave server.
[mysqld]
server-id = 101
binlog-format = mixed
log_bin = mysql-bin
relay-log = mysql-relay-bin
log-slave-updates = 1
read-only = 1
You'll notice that this file is different than the configurations for the slave server. The server ID is also different. Each of your MySQL servers should have a different ID assigned. The slave server is set with the same binary log settings, but the data is read only. This is because the slave server is completely dependent on the master server. You don't want users inserting records on the slave server without fist inserting data in the master server. Since the master server doesn't poll the slave server for data, inserting data on the slave server leaves you with data in one server and not the other. The synchronization is one way, so data should only flow from master to server.
After you configure your server, you must restart your MySQL server just like you did with the master. Type the following code into your shell command.
mysqld restart
Now it's time to import the data and SQL schema into the replication server. If you compressed the file before you dumped it to the backup file, you first need to extract the information from the compressed file into the original .sql dump file.
You have the sql server exported, but now you need to import the statements to the slave server's database. Remember that mysqldump exports data as statements, so it can take a while for all statements to run.
To import your new slave database, run the following command on your slave MySQL server:
mysql –u root –p < ~/dump.sql
Of course, this command assumes that the dump file is in the same user directory as the previously exported sql file.
If you compressed the file first, you first need to extract the information. The following statements show you how to first extract the sql file and then import the dump.sql file to your slave database.
gunzip ~/dump.sql.gz
mysql -u root -p < ~/dump.sql
If you run Windows, you'll need to use a third-party tool to extract the sql file. This command assumes you are running a Unix based system.
Now you need to tell the MySQL slave server to use its master server log file, and how it can connect to the master server in the future. You need to specify a few pieces of information Just like the master server, you also need a user account for your slave server. The following command sets up a server with the proper user permissions.
CHANGE MASTER TO MASTER_HOST='<<master-server-ip>>',MASTER_USER='replicant',MASTER_PASSWORD='<<slave-server-password>>', MASTER_LOG_FILE='<<value from above>>', MASTER_LOG_POS=<<value from above>>;
START SLAVE;
The above command is a bit different than the master server user creation. The first statement asks you to specify the master server. Of course, your slave server needs to know this information to connect to the right master server. You also need to specify the master password.
Remember the MASTER_LOG_FILE and MASTER_LOG_POS values you recorded earlier? These values are used with this statement.
The final statement tells MySQL server to start the slave service. This is similar to kickstarting the server when it's ready to run after configurations are finished.
After you're finished with the configurations and import, you can check your local master and slave configuration by running the following statement:
SHOW SLAVE STATUS \G
This command can be run at any time to ensure that the slave database is polling the master server for new information. If everything is running smoothly, the following output is displayed.
Last_Error
Slave_IO_State….. Waiting for master to send event
Seconds_Behind_Master….. 2
The Last_Error statement should be blank. This value indicates the last error the MySQL database found when it polled the master server.
The Slave_IO_State value should be set to "waiting for master to send event." This means that the slave server is waiting for the master server to have new write events in the binary log. When there are no updates, no data is sent to preserve network and hardware resources. Data is only moved when changes are made. Remember, no data changes when you run a SELECT statement, so any pure read statements won't show up on the slave server. If you use any SHOW commands, they do not log either since they don't change any information on the database.
The Seconds_Behind_Master value is the number of seconds the MySQL slave user lags behind the master server. In this example, it's 2 seconds. If you have a busy MySQL server database, you might want to change this value. Smaller servers can work with larger lag times. Fewer polls to the database will preserve performance on the master database and the network resources. However, if you transfer large amounts of data because you've increased the lag time, this can harm network resources and performance. You'll need to work with different values to identify which one is best for your network.
It should be noted that all of these steps can be done while the MySQL database server runs in production, but always take a backup of your data before you make large configuration changes such as a master-slave configuration. The mysqldump file is the backup you need, but if you add replication to a slave server running other application data, you should take a backup before changing the database configurations.
Common Errors and Troubleshooting for MySQL Replication
In some cases, you might accidentally change the data in your replication server. You don't want to manually change data in a replication server. Instead, you always push data to the master server or edit the data on the master server. However, it happens on occasion from either accidentally running queries on the wrong server or an administrator doesn't realize the slave database is in a replication environment.
If this happens, you must first remove the data changes either through transaction logs or manually editing the data. You could redo these steps to recreate the replicated slave database. Whatever choice you make, the slave database should always be a mirrored copy of the master database, and you should not manually change data in the slave environment. This can cause errors on your server and destroys data integrity if you need to use the replicated data as a support backup when the master crashes.
After you fix the data, you should run the following command on your slave server:
STOP SLAVE;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
START SLAVE;
This statement first stops the MySQL server. It then tells the slave server to skip the latest master server events. In this example, we skip one event. We then restart the MySQL database to return it to production.
This step is also useful if you find that the slave server received an error during replication. This can happen for numerous reasons. A few causes for this issue are a network connectivity error, a malformed statement, corrupted statements or a crash on either server. You should use this statement carefully, though. If you skip too many events, you miss important transactional changes to your database data.
Some database administrators lock tables as they replicate data from the master server to the mysqldump file. This stops the MySQL database server from writing any data to your tables. We provided a solution that does not interrupt users, but this isn't always an option if you need to stop write statements while you replicate data. It's often used by database administrators responsible for very active database servers.
When you lock tables, you must unlock them afterwards. If you notice that write statements are failing on your master database, run the following command to unlock your SQL tables.
UNLOCK TABLES;
That's it – the statement unlocks your tables, so your master database can continue to operate as a production server.
This article showed you how to replicate your data. If you have the resources and the time to create a replicated environment, you can greatly improve disaster recovery efforts if your main database ever crashes.