Book Image

MySQL Admin Cookbook LITE: Replication and Indexing

Book Image

MySQL Admin Cookbook LITE: Replication and Indexing

Overview of this book

Table of Contents (3 chapters)

Sharing read load across multiple machines

Often you have a very unequal distribution of read and write operations on a database. Websites usually get many more visitors just browsing and reading contents than actually contributing. This results in the database server being mainly busy reading information instead of adding or modifying existing material.

Replication can be used to alleviate scalability issues when your site reaches a certain size and a single machine might reach the limits of its performance reserves.

Unfortunately, MySQL does not offer this load-balancing functionality itself, so you will need to take appropriate actions on the application level.

In this recipe, we will show you the general procedure to follow when sharing read accesses between two slave machines while still aiming writes at the master. Beware that due to the asynchronous nature of MySQL replication, your application must be able to handle slightly out-of-date results because issuing an INSERT, UPDATE, or DELETE against the master will not mean that you can read the modified data back immediately as the slave might take some time to catch up. Usually, on a local network this should be a couple of seconds at most, but nevertheless the application code must be ready for that.

To simplify the scheme, you should design your application to exclusively read from the slaves and only use the master for modifications. This brings the additional benefit of being able to keep the overall system up and running while switching to a read-only mode temporarily, backing up the master server. This is not part of this recipe, however.

The example used in this recipe uses three database servers. The sample application is written in Java, using the MySQL Connector/J JDBC database driver. Depending on what application platform you are using, syntax and function names will differ, but the general principle is language independent.

Getting ready

Depending on your application infrastructure, you will need privileges to change its database connectivity configuration and the source code. This is usually a task that requires cooperation with application developers.

To follow along with this example you should be familiar with the Java language and its basic constructs.

Moreover, you will need three MySQL servers one configured as the master and two others as slaves. They will be referred to as master, slave1, and slave2 in this example. Substitute your concrete host names appropriately.

You will also need the Java Standard Edition development tools available from http://java.sun.com, and the MySQL Connector/JDBC driver available from http://dev.mysql.com. Download and install both, if you do not already have them.

How to do it...

  1. Download the file called MySQLBalancingDemo.java from the book's website. It contains the following code:
    …
    Connection conn = driver.connect("jdbc:mysql://master:3306,slave1:3307,slave2:3308/sakila?user=testuser&password=testpass&roundRobinLoadBalance=true", null);
    conn.setReadOnly(false); // target the MASTER
    rs = conn.createStatement().executeQuery(
    "SELECT @@server_id;");
    rs.next();
    System.out.println("Master: " + rs.getString(1));
    conn.setReadOnly(true); // switch to one of the slaves
    rs = conn.createStatement().executeQuery(
    "SELECT @@server_id;");
    rs.next();
    System.out.println("Slave: " + rs.getString(1));
    conn.close();
    ...
    
  2. Compile the file using the javac compiler. Alternatively, an integrated development environment like Eclipse or Netbeans can take care of this for you:
    $ javac -cp mysql-connector-java-5.1.7-bin.jar MySQLBalancingDemo.java
    
  3. Run the sample application and see how it automatically distributes the read requests between the two slaves:
    $ java -cp .:mysql-connector-java-5.1.7-bin.jar MySQLBalancingDemo
    Master: 1000
    Slave: 13308
    

How it works…

You just compiled and ran a small program that demonstrates round-robin load balancing.

The first line of output is the master's server-ID setting, because the first connection was not set to read only. The connection is then declared to be targeted at the slaves via setReadOnly(true). The next query will then return the server ID of the particular slave it was balanced to. You might need to run the demo a few times to see a different slave being used because the algorithm that balances the load does not strictly toggle each time, but might direct a few connections against the same slave.

There's more...

While the JDBC driver makes it relatively easy to use read load balancing across several slaves, it only helps you take the first step on the way. You must take care that the application knows which connection to use for write operations and which for read. It must also cope with slaves and master possibly being slightly out of sync all the time. Concentrating that this special logic in a class of its own, is advisable to limit the effect on the rest of the application.

Working with connection pools

When working with connection pooling, be sure to initialize any connection you get to the correct mode using the setReadOnly() method, to be sure you know what state it is in. You might be handed a connection that was set to the wrong mode when it was put back into the pool.

Working on other programming environments

In development environments not using Java, you might have to take care of managing the cycling between slaves yourself. Independent of the actual language or environment you are using, a good practice is to channel all database operations through a set of functions or methods that centrally manage the balancing. You could provide functions that handle INSERT, UPDATE, and DELETE operations, always connecting to the master and a SELECT function going to the slaves for data reads.

In case you need to select something back that you just wrote and cannot allow for the replication lag, you might also provide a read function querying the master machine. You should use this sparingly, however, because it definitely counteracts the intention of relieving the master from the read load.

Considering efficiency while adding slaves

Of course, the slaves have to perform write operations as well to keep up with the master. This means their performance is not fully available for reads, limiting scalability.

So adding more slaves does not proportionally improve performance of the overall system.