MySQL Admin Cookbook LITE: Replication and Indexing

Limiting network and slave I/O load in heavy write scenarios using the blackhole storage engine

If you have a large number of slaves and a rather busy master machine, the network load can become significant, even when using compression. This is because all statements written to the binlog are transferred to all the slaves, which store them in their relay logs and process them asynchronously.

The main reason for the heavy network load is the filter on the slave paradigm that MySQL employs: everything is sent to every one of the slaves, and each one decides, based on its particular configuration, which statements to throw away and which to apply. In the worst case, every single change to a database must be transmitted just to replicate a single table.

Getting ready

The following procedure is based on Linux. To repeat it on Windows, adapt the path names and some of the shell syntax accordingly.

To follow along, you will need a MySQL daemon with the blackhole storage engine enabled. Verify this with the following command:

mysql> show variables like '%blackhole%';
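If the engine is compiled in, the result should contain a line similar to the following (shown here as it appears on a 5.x server; the exact variable name can differ between MySQL versions):

```
mysql> show variables like '%blackhole%';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| have_blackhole_engine | YES   |
+-----------------------+-------+
```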

Strictly speaking, only the actual filter instance needs a blackhole-enabled MySQL server. For this example, however, we will use a single machine and a single server installation, just with different configuration files and data directories.

In the following steps, we assume you have installed a copy of MySQL in a folder called blacktest in your home directory. Modify accordingly if your setup differs.

How to do it...

  1. Create three distinct data directories: one for the master, one for the blackhole filter engine, and one for a slave.
    ~/blacktest$ mkdir data.master
    ~/blacktest$ mkdir data.slave
    ~/blacktest$ mkdir data.black
    
  2. Into each of those, copy the MySQL accounts database. Ideally, you should take an empty one from a freshly downloaded distribution to make sure you do not accidentally copy users you do not want.
    ~/blacktest$ cp -R data/mysql data.master
    ~/blacktest$ cp -R data/mysql data.slave
    ~/blacktest$ cp -R data/mysql data.black
    
  3. Configure the master instance. To do so, create a configuration file called my.master and make sure that it contains the following settings:
    [client]
    port = 3307
    socket = /home/ds/blacktest/master.sock
    [mysqld_safe]
    socket = /home/ds/blacktest/master.sock
    [mysqld]
    user = mysql
    pid-file = /home/ds/blacktest/master.pid
    socket = /home/ds/blacktest/master.sock
    port = 3307
    basedir = /home/ds/blacktest
    datadir = /home/ds/blacktest/data.master
    tmpdir = /tmp
    language = /home/ds/blacktest/share/mysql/english
    bind-address = 127.0.0.1
    server-id = 1
    log-bin = /home/ds/blacktest/master-bin.log
    

    The values specific to the master instance (the port, the socket and PID file names, the data directory, the server-id, and the binlog path) are going to be different for the filter and slave instances.

  4. Start the master daemon for the first time to make sure everything works so far. We recommend a dedicated window for this daemon. For example:
    ~/blacktest$ xterm -T MASTER -e bin/mysqld \
    > --defaults-file=my.master \
    > --console &
    

    This will start the daemon in the background and show its output in a new window:


    The warning about the --user switch can be ignored for now. If you do not get a message ending in ready for connections, go back and find the error in your setup before going on. The error messages issued by MySQL are usually verbose enough to put you back on track quickly.

  5. Insert some test data to be able to verify the correct function of the filter later. To do so, connect to the master instance just started and create some tables and data:
    ~/blacktest$ bin/mysql -uroot -S master.sock --prompt='master>'
    master> CREATE DATABASE repdb;
    master> USE repdb;
    master> CREATE TABLE tblA (
    -> id INT(10) PRIMARY KEY NOT NULL,
    -> label VARCHAR(30)
    -> ) ENGINE=InnoDB;
    master> CREATE TABLE tblB (
    -> name VARCHAR(20) PRIMARY KEY NOT NULL,
    -> age INT(3)
    -> ) ENGINE=InnoDB;
    master> INSERT INTO tblA VALUES
    -> (1, 'label 1'),
    -> (2, 'label 2'),
    -> (3, 'label 3');
    master> INSERT INTO tblB VALUES
    -> ('Peter', 55),
    -> ('Paul', 43),
    -> ('Mary', 25);
    

    Inserting this data already creates binlog information. You can easily verify this by looking at the file system: the master-bin.000001 file should have grown to around 850 bytes by now. The exact size may vary slightly if you did not enter the commands above with the exact same number of spaces, because the binlog stores statements exactly as you typed them.

    For this example, we will only replicate changes to table tblB and ignore anything that happens to table tblA. We assume that tblB needs to be written by an application on the slave, so the table should be present, but empty, on the slaves to avoid key collisions.

  6. Create a user account on the master for the filter to connect with:
    master> GRANT REPLICATION SLAVE
    -> ON *.*
    -> TO 'repblack'@'localhost'
    -> IDENTIFIED BY 'blackpass';
    
  7. Configure the filter (blackhole) instance with a configuration file named my.black that contains at least the following:
    [client]
    port = 3308
    socket = /home/ds/blacktest/black.sock
    [mysqld_safe]
    socket = /home/ds/blacktest/black.sock
    [mysqld]
    log-slave-updates
    skip-innodb
    default-storage-engine=blackhole
    user = mysql
    pid-file = /home/ds/blacktest/black.pid
    socket = /home/ds/blacktest/black.sock
    port = 3308
    basedir = /home/ds/blacktest
    datadir = /home/ds/blacktest/data.black
    tmpdir = /tmp
    language = /home/ds/blacktest/share/mysql/english
    bind-address = 127.0.0.1
    server-id = 2
    log-bin = /home/ds/blacktest/black-bin.log
    relay-log = /home/ds/blacktest/black-relay.log
    

    Note

    Notice that all occurrences of master have been replaced with black!

    Moreover, the server-id setting has been changed, and the log-slave-updates, skip-innodb, and default-storage-engine options have been added. The first one makes this instance write statements it receives via replication to its own binlog, so that slaves can in turn replicate from it. The second one prevents this instance from creating ibdata table space files, which would not be used later anyway. The last one specifies which storage engine to use when a CREATE TABLE statement does not explicitly specify one or if the specified engine is not available. We will come back to this soon.

  8. Make sure this instance basically works by starting it the same way as the master before (you will not see the InnoDB messages here, of course).
    ~/blacktest$ xterm -T BLACK -e bin/mysqld \
    > --defaults-file=my.black \
    > --console &
    
  9. Create a set of dump files from the master to set up both the blackhole filter and an example slave. The details on why we need two and in which ways they are different will be explained later. Use these commands to create the files needed:
    ~/blacktest$ bin/mysqldump -S master.sock -uroot \
    > --master-data \
    > --single-transaction \
    > --no-create-info \
    > --ignore-table=repdb.tblA \
    > repdb > master_data.sql
    ~/blacktest$ bin/mysqldump -S master.sock -uroot \
    > --no-data \
    > repdb > master_struct.sql
    
  10. Connect to the filter server, create the database, make it the default database, and finally, import the structure information created before:
    ~/blacktest$ bin/mysql -uroot -S black.sock --prompt='black> '
    black> CREATE DATABASE repdb;
    black> USE repdb;
    black> source master_struct.sql;
    

    At this point, the table structure from the master has been transferred to the filter instance; because of the skip-innodb and default-storage-engine settings, all the tables silently use the blackhole engine there.

  11. Set up the replication between master and filter engine. To do so, we need to know the exact position from where the filter will start replicating. Extract this information from the previously taken data dump like this:
    ~/blacktest$ head -n 30 master_data.sql | grep 'CHANGE MASTER TO'
    

    Write down that information; we will need it in a moment.
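The grep above prints a CHANGE MASTER TO line from the dump header. As a self-contained sketch of extracting the two values from such a line with sed (the sample line and its values are hypothetical; use the ones from your own dump):

```shell
# A dump created with --master-data contains a line like this one
# (sample values; your file name and position will differ):
line="CHANGE MASTER TO MASTER_LOG_FILE='master-bin.000001', MASTER_LOG_POS=1074;"

# Extract the binlog file name and position with sed capture groups:
file=$(printf '%s\n' "$line" | sed -n "s/.*MASTER_LOG_FILE='\([^']*\)'.*/\1/p")
pos=$(printf '%s\n' "$line" | sed -n 's/.*MASTER_LOG_POS=\([0-9]*\).*/\1/p')
echo "$file $pos"   # -> master-bin.000001 1074
```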

  12. Modify the my.black configuration file to contain the following in the [mysqld] section:
    replicate-ignore-table=repdb.tblA
    replicate-do-table=repdb.tblB
    

    This is a very simple filter setup; in a real application scenario these rules will probably be more complex.
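In more involved setups, MySQL's wildcard rule variants can keep such filter lists manageable. A sketch (the option names are real MySQL settings; the patterns are made-up examples):

```
[mysqld]
# Exact-match rules, as used in this recipe:
replicate-ignore-table=repdb.tblA
replicate-do-table=repdb.tblB
# Wildcard variants; % and _ behave as in LIKE patterns:
replicate-wild-do-table=repdb.app%
replicate-wild-ignore-table=repdb.tmp%
```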

  13. Restart the filter engine to activate the new configuration:
    ~/blacktest$ bin/mysqladmin -uroot -S black.sock shutdown
    ~/blacktest$ xterm -T BLACK -e bin/mysqld \
    > --defaults-file=my.black \
    > --console &
    
  14. Reconnect the client that was connected to the blackhole instance; the restart closed its connection. To do this, just issue a SELECT 1; command.
  15. Execute the following command to hook up the filter to the master. Be sure to fill in the values you wrote down a moment ago in the statement:
    black> CHANGE MASTER TO
    -> master_host='localhost',
    -> master_port=3307,
    -> master_user='repblack',
    -> master_password='blackpass',
    -> master_log_file='master-bin.000001',
    -> master_log_pos=1074;
    
  16. Retrieve the information required to set up the filter/slave portion. Write down the results of the SHOW MASTER STATUS command; they will be needed later:
    black> FLUSH LOGS;
    black> SHOW MASTER STATUS;
    +------------------+----------+---+---+
    | File             | Position | … | … |
    +------------------+----------+---+---+
    | black-bin.000003 |       98 |   |   |
    +------------------+----------+---+---+
    
  17. Start the slave thread on the filter engine and verify that everything is going well:
    black> START SLAVE;
    black> SHOW SLAVE STATUS \G
    ************************** 1. row ***************************
    Slave_IO_State: Waiting for master to send event
    Master_Host: localhost
    Master_User: repblack
    Master_Port: 3307
    Connect_Retry: 60
    Master_Log_File: master-bin.000001
    Read_Master_Log_Pos: 1074
    Relay_Log_File: black-relay.000003
    Relay_Log_Pos: 236
    Relay_Master_Log_File: master-bin.000001
    Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
    ...
    Replicate_Do_Table: repdb.tblB
    Replicate_Ignore_Table: repdb.tblA
    ...
    Last_Errno: 0
    Last_Error:
    Skip_Counter: 0
    Exec_Master_Log_Pos: 1074
    Relay_Log_Space: 236
    ...
    Seconds_Behind_Master: 0
    

    At this point we have successfully established a replication connection between the master database and the blackhole-based filter instance.

  18. Check that nothing has yet been written to the filter's binlogs. Because we issued a FLUSH LOGS command on the filter instance, there should be nothing in the most recent binlog file. Verify this as follows:
    ~/blacktest$ bin/mysqlbinlog black-bin.000003
    
  19. Test the filter setup with some statements issued on the master:
    master> UPDATE repdb.tblA
    -> SET label='modified label 3'
    -> WHERE id=3;
    master> INSERT INTO repdb.tblB
    -> VALUES ('John', 39);
    

    We would expect to see the INSERT in the binlog file of the filter instance, but not the UPDATE statement, because it modifies tblA, which is to be ignored.

  20. Verify that the rules work as expected by having another look at the filter's binlogs:
    ~/blacktest$ bin/mysqlbinlog black-bin.000003
    

    This looks precisely as expected—the INSERT is present, the UPDATE is nowhere to be seen.

  21. Set up the configuration of a slave using these settings:
    [client]
    port = 3309
    socket = /home/ds/blacktest/slave.sock
    [mysqld_safe]
    socket = /home/ds/blacktest/slave.sock
    [mysqld]
    user = mysql
    pid-file = /home/ds/blacktest/slave.pid
    socket = /home/ds/blacktest/slave.sock
    port = 3309
    basedir = /home/ds/blacktest
    datadir = /home/ds/blacktest/data.slave
    tmpdir = /tmp
    language = /home/ds/blacktest/share/mysql/english
    bind-address = 127.0.0.1
    server-id = 3
    relay-log = /home/ds/blacktest/slave-relay.log
    

    Note

    Notice that all occurrences of master have been replaced with slave!

    Again, the server-id setting has been changed. The log-slave-updates, skip-innodb, and default-storage-engine options that were part of the filter instance's configuration are not included here. The log-bin parameter has also been removed, because changes on the slave need not be recorded separately.

  22. Start up the slave engine. You will see the familiar messages about InnoDB filling up the data files and finally, the Ready for connections line:
    ~/blacktest$ xterm -T SLAVE -e bin/mysqld \
    > --defaults-file=my.slave \
    > --console &
    
  23. Then connect a client to the slave and create the database:
    ~/blacktest$ bin/mysql -uroot -S slave.sock --prompt='slave> '
    slave> CREATE DATABASE repdb;
    slave> USE repdb;
    

    At this point, the slave is set up and has an empty repdb database.

  24. Fill up the slave database with the initial snapshot of the master. Two files need to be loaded here; the reason why is explained in the How it works... section.
    slave> source master_struct.sql;
    ...
    slave> source master_data.sql;
    ...
    
  25. Verify that you can find the data from the master on the slave by running SELECT * FROM repdb.tblA and then SELECT * FROM repdb.tblB.

    The first SELECT shows no records because tblA was excluded from the dump. Table tblB contains the three records we inserted on the master.

  26. Create a replication user account on the filter instance for the slaves to use:
    black> GRANT REPLICATION SLAVE
    -> ON *.*
    -> TO 'repslave'@'localhost'
    -> IDENTIFIED BY 'slavepass';
    
  27. Connect the slave to the filter engine. Be sure to insert the correct values for MASTER_LOG_FILE and MASTER_LOG_POS in the statement. Those are the values you wrote down when you issued the SHOW MASTER STATUS command on the filter server before starting the replication there:
    slave> CHANGE MASTER TO
    -> master_host='localhost',
    -> master_port=3308,
    -> master_user='repslave',
    -> master_password='slavepass',
    -> master_log_file='black-bin.000003',
    -> master_log_pos=98;
    Query OK, 0 rows affected (0.01 sec)
    
  28. Start the slave and verify that it starts up correctly:
    slave> START SLAVE;
    slave> SHOW SLAVE STATUS \G
    ************************** 1. row ***************************
    Slave_IO_State: Waiting for master to send event
    ...
    Relay_Master_Log_File: black-bin.000003
    Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
    ...
    Seconds_Behind_Master: 0
    
  29. As soon as the previous step is complete, the replication should already have updated tblB on the slave and inserted the new ('John', 39) record. Verify it like this:
    slave> SELECT * FROM repdb.tblB;
    

    Apparently, the replication works. You can now try to modify some data on the master and check if the results match on the slave. Anything you do to modify tblB should be reflected on the slave. Remember to use fully qualified statements; otherwise changes will not match the replication rules.
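To illustrate the point about fully qualified statements (the table and values below come from this example; the second form is shown only as the discouraged variant):

```
-- Fully qualified, as recommended in this recipe:
UPDATE repdb.tblB SET age = 44 WHERE name = 'Paul';

-- Relies on the current default database; per the advice above,
-- such unqualified statements may not match the filter rules:
USE repdb;
UPDATE tblB SET age = 44 WHERE name = 'Paul';
```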

How it works...

Though MySQL does not literally implement a filter on the master feature, it provides another way of achieving something similar. While MyISAM and InnoDB implement ways of storing data on disk, another engine was created that is basically an empty shell: it answers OK to all INSERT, UPDATE, or DELETE requests coming from the SQL layer above, and SELECT statements always return an empty result set. This engine is suitably called the blackhole storage engine, as everything you put into it simply vanishes.

(Figure: the replication topology, with the blackhole filter server placed between the main master and the slaves)

In the upper part you see the main master server. All modifying statements are written on the master's binlog files and sent over the network to subscribed slaves. In this case, there is only a single slave: the filter server in the middle. The thick arrow in between them represents the large amount of data that is sent to it.

In the lower part of the picture, there are a number of slaves. In a regular setup, a thick arrow would be drawn from the master to each of those—meaning that the same massive amount of replication data would be sent over the network multiple times. In this picture, the filter server is configured to ignore statements for certain tables. It is also configured to write the statements received from a replication master to its own binlogs. This is different from regular slaves because usually those do not write replicated statements to their binlogs again. The filter server's binlogs are much smaller than those of the main master because lots of statements have been left out. This would normally have taken place on each and every regular slave.

The regular slaves are configured against the filter server. That means they only receive the pre-filtered stream of statements that have made it into the filter's binlogs through the replicate-ignore-* and replicate-do-* directives. This is represented by thin arrows in the picture.
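Condensed from the my.black file shown earlier, these are the settings that make an ordinary slave act as a forwarding filter:

```
[mysqld]
# Receive the master's statements into a relay log:
relay-log = /home/ds/blacktest/black-relay.log
# Keep an own binlog for the downstream slaves to read:
log-bin   = /home/ds/blacktest/black-bin.log
# Also log statements applied by the slave SQL thread; without this,
# replicated statements would never reach the filter's binlog:
log-slave-updates
```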

Because slaves can go offline for extended periods of time, binlogs can easily grow by dozens of gigabytes within a few days. With the much smaller filtered binlogs in place, you can purge the large binlogs on the main master as soon as you have made a full backup, in the end freeing more space than the additional filter instance needs.
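Purging on the main master can be done manually with a statement like the following (the file name is an example; pick the earliest binlog still needed by the filter server):

```
master> PURGE BINARY LOGS TO 'master-bin.000010';
```

Alternatively, the expire_logs_days server option ages binlogs out automatically after the given number of days.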

Storage engines other than InnoDB

Be advised that if you are using a storage engine other than InnoDB for your tables (especially MyISAM), you will need to do a little more tweaking. The InnoDB example works because MySQL is very lenient about this kind of problem: we put the skip-innodb option into the my.black config file, so InnoDB is not available at runtime. Because the master_struct.sql dump file contains CREATE TABLE … ENGINE=InnoDB statements, MySQL falls back to the default storage engine, which we configured to be the blackhole engine.

If you are using MyISAM tables, the server will not automatically change the table type, because MyISAM is always available (MySQL stores its user information, among other things, in MyISAM tables). So you need to adapt the master_struct.sql dump file before sourcing it into the filter server. I recommend using sed, like this:

~/blacktest$ sed -e 's/ENGINE=MyISAM/ENGINE=BLACKHOLE/g' \
> master_struct.sql > master_black.sql

This will replace all occurrences of the MyISAM engine with the blackhole engine and put the result into a new file, which you then source into the filter server instead. Please keep the original file, too, as it will be needed for the slave machines.
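As a self-contained sketch of this rewrite, the following uses a pattern that matches any engine name (the sample.sql content is made up for illustration; a real dump contains many such lines):

```shell
# Create a one-line stand-in for a structure dump:
printf '%s\n' "CREATE TABLE tblB (name VARCHAR(20)) ENGINE=MyISAM;" > sample.sql

# Replace whatever engine is named with BLACKHOLE:
sed -e 's/ENGINE=[A-Za-z]*/ENGINE=BLACKHOLE/g' sample.sql > sample_black.sql
cat sample_black.sql   # -> CREATE TABLE tblB (name VARCHAR(20)) ENGINE=BLACKHOLE;
```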