Book Image

High Availability MySQL Cookbook

By : Alexander Davies
Book Image

High Availability MySQL Cookbook

By: Alexander Davies

Overview of this book

High Availability is something that all web sites hope to achieve, especially those that are linked to big companies.MySQL, an open source relational database management system (RDBMS), can be made highly available to protect from corruption, hardware failure, software crashes, and user error. Running a MySQL setup is quite simple. Things start getting complex when you start thinking about the best way to provide redundancy. There are a large number of techniques available to add 'redundancy' and 'high availability' to MySQL, but most are both poorly understood and documented.This book will provide you with recipes showing how to design, implement, and manage a MySQL Cluster and achieve high availability using MySQL replication, block level replication, shared storage, and the open source Global File System (GFS).This book covers all the major techniques available for increasing availability of your MySQL databases. It demonstrates how to design, implement, troubleshoot and manage a highly available MySQL setup using any one of several techniques, which are shown in different recipes. It is based on MySQL Cluster 7.0, MySQL (for non clustered recipes) 5.0.77, and CentOS / RedHat Enterprise Linux 5.3.The book starts by introducing MySQL Cluster as a technology and explaining how to set up a simple cluster. It will help you to master the options available for backing up and restoring a file in the MySQL Cluster. By following the practical examples in this book, you will learn how to manage the MySQL Cluster. Further, we will discuss some troubleshooting aspects of the MySQL Cluster.We also have a look at achieving high availability for MySQL databases with the techniques of MySQL Replication, block level replication, shared storage (a SAN or NAS), and DRBD.Finally, you will learn the principles of Performance tuning and tune MySQL database for optimal performance.
Table of Contents (17 chapters)
High Availability MySQL Cookbook
Credits
About the Author
About the Reviewers
Preface
Base Installation
Index

Creating an initial cluster configuration file—config.ini


In this recipe, we will discuss the initial configuration required to start a MySQL Cluster. A MySQL Cluster has a global configuration file—config.ini, which resides on all management nodes. This file defines the nodes (processes) that make up the cluster and the parameters that the nodes will use.

Each management node, when it starts, reads the config.ini file to get information on the structure of the cluster and when other nodes (storage and SQL / API) start, they contact the already-running management node to obtain the details of the cluster architecture.

The creation of this global configuration file—config.ini, is the first step in building the cluster and this recipe looks at the initial configuration for this file. Later recipes will cover more advanced parameters which you can define (typically to tune a cluster for specific goals, such as performance).

How to do it…

The first step in building a cluster is to create a global cluster configuration file. This file, called config.ini, by convention, is stored on each management node and is used by the management node process to show the cluster makeup and define variables for each node. In our example, we will store this in the file /usr/local/mysql-cluster/config.ini, but it can be stored anywhere else.

The file consists of multiple sections. Each section contains parameters that apply to a particular node, for example, the node's IP address or the amount of memory to reserve for data. Each type of node (management, SQL, and data node) has an optional default section to save duplicating the same parameter in each node. Each individual node that will make up the cluster has its own sections, which inherits the defaults defined for its type and specifies the additional parameters, or overrides the defaults.

This global configuration file is not complex, but is best analyzed with an example, and in this recipe, we will create a simple cluster configuration file for this node. The first line to add in the config.ini file is a block for this new management node:

[ndb_mgmd]

Now, we specify an ID for the node. This is absolutely not required, but can be useful—particularly if you have multiple management nodes.

Id=1

Now, we specify the IP address or hostname of the management node. It is recommended to use IP addresses in order to avoid a dependency on the DNS:

HostName=10.0.0.5

Note

It is possible to define a node without an IP address, in this case, a starting node can either be told which nodeID it should take when it starts, or the management node will allocate the node to the most suitable free slot.

Finally, we define a directory to store local files (for example, cluster log files):

DataDir=/var/lib/mysql-cluster

This is all that is required to define a single management node.

Now, we define the storage nodes in our simple cluster. To add storage nodes, it is recommended that we use the default section to define a data directory (a place for the node to store the files, which the node stores on the disk). It is also mandatory to define the NoOfReplicas parameter, which was discussed in the There's more… section of the previous recipe.

[<type>_default] works for all three types of nodes (mgmd, ndbd, and mysqld) and defines a default value to save duplicating a parameter for every node in that section. For example, the DataDir of the storage nodes can (and should) be defined in the default section:

[ndbd_default]
DataDir=/var/lib/mysql-cluster
NoOfReplicas=2

Once we have defined the default section, then defining the node ID and IP / hostname for the other storage nodes that make up our cluster is a simple matter as follows:

[ndbd]
id=3
HostName=10.0.0.1

[ndbd]
id=4
HostName=10.0.0.2

Note

You can either use hostnames or IP addresses in config.ini file. I recommend that you use IP addresses for absolute clarity, but if hostnames are used, it is a good idea to ensure that they are hardcoded in /etc/hosts on each node in order to ensure that a DNS problem does not cause major issues with your cluster.

Finally for SQL nodes, it is both possible and common to simply define a large number of [mysqld] sections with no HostName parameter. This keeps the precise future structure of the cluster flexible (this is not generally recommended for storage and management nodes).

It is a good practice to define the hostnames for essential nodes, and if desired also leave some spare sections for future use (the recipe Taking an online backup of a MySQL Cluster later in Chapter 2, MySQL Cluster Backup and Recovery will explain one of several most common reasons why this will be useful). For example, to define two-cluster SQL nodes (with their servers running mysqld) with IP addresses 10.0.0.2 and 10.0.0.3, with two more slots available for any SQL or API nodes to connect to on a first come, first served basis, use the following:

[mysqld]
HostName=10.0.0.2

[mysqld]
HostName=10.0.0.3

[mysqld]

[mysqld]

Now that we have prepared a simple config.ini file for a cluster, it is potentially possible to move on to installing and starting the cluster's first management node. Recollect where we saved the file (in /usr/local/mysql-cluster/config.ini) as you will need this information when you start the management node for the first time.

There's more…

At this stage, we have not yet defined any advanced parameters. It is possible to use the config.ini file that we have written so far to start a cluster and import a relatively small testing data set (such as the world database provided by MySQL for testing, which we will use later in this book). However, it is likely that you will need to set a couple of other parameters in the ndbd_default section of the config.ini file before you get a cluster in which you can actually import anything more than a tiny amount of data.

Firstly, there is a maximum limit of 32,000 concurrent operations in a MySQL Cluster, by default. The variable MaxNoOfConcurrentOperations sets the number of records that can be in update phase or locked simultaneously. While this sounds like a lot, it is likely that any significant import of data will exceed this value, so this can be safely increased. The limit is set deliberately low to protect small systems from large transactions. Each operation consumes at least one record, which has an overhead of 1 KB of memory.

Note

The MySQL documentation states the following:

Unfortunately, it is difficult to calculate an exact value for this parameter so set it to a sensible value depending on the expected load on the cluster and monitor for errors when running large transactions (often when importing data):

MaxNoOfConcurrentOperations = 150000

A second extremely common limit to increase is the maximum number of attributes (fields, indexes, and so on) in a cluster which defaults to 1000. This is also quite low, and in the same way it can normally be increased:

MaxNoOfAttributes = 10000

The maximum number of ordered indexes is low and if you reach it, it will return a slightly cryptic error, Can't create table xxx (errno: 136). Therefore, it is often worth increasing it at the start, if you plan on having a total of more than 128 ordered indexes in your cluster:

MaxNoOfOrderedIndexes=512

Finally, it is almost certain that you will need to define some space for data and indexes on your storage nodes. Note that you should not allocate more storage space than you have to spare on the physical machines running the storage nodes, as a cluster swapping is likely to happen and the cluster will crash!

DataMemory=2G
IndexMemory=500M

With these parameters set, you are ready to start a cluster and import a certain amount of data in it.