Getting Started with Memcached

By: Ahmed Soliman

Overview of this book

Web application performance is no longer a non-functional requirement, but an implicit condition for an engaging user experience. As a result, responsive and highly scalable applications are becoming a necessity. Memcached is a high-performance, distributed memory caching system built to speed up dynamic web applications by offloading pressure from your database.

Getting Started with Memcached is a hands-on, comprehensive guide to the Memcached service and its API in different programming languages. It contains practical recipes to integrate Memcached into your Rails, Django, or even Scala Play! applications.

This book will show you everything you need to know to start using Memcached in your existing or new web applications. It uses real-world recipes to help you learn how to store and retrieve data from your clustered virtual memory cache pool and how to integrate caching into your favourite web development framework.

You will also learn how to build a scalable, consistent-hashing Memcached cluster and how to properly configure Memcached clients in Ruby, Python, PHP, and Java to use different servers and scale out your memory cache pool. With this book, you will see how to cache templates and database queries in the most popular web development frameworks in use today.
Table of Contents (9 chapters)

Setting up distributed memcached (Intermediate)


One of the most common use cases for memcached is building a distributed cache spanning multiple machines in a cluster. This setup lets you scale memcached horizontally: by adding more machines to the cluster, you expand the total memory available to your application as a cache. The benefit of horizontally scalable caching is that you are no longer limited by the amount of RAM you can install in a single server. It also means you can utilize free memory on your web servers or other machines, so that collectively you have a distributed memcached environment with a single large virtual memory pool for your caching needs.

Building a distributed memcached environment is far simpler than you might think. The memcached daemon knows nothing about the cluster setup and needs no special server-side configuration; it is the client, not the server, that distributes the data across the pool.
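To make the division of labour concrete, here is a sketch of the simplest client-side routing a client could do: hash the key and pick a server by modulo. This is purely illustrative (the server list and the naive_route function are assumptions for the example, and real clients such as pylibmc use consistent hashing instead):

```python
import hashlib

# Hypothetical server list -- in a real client this is the same
# list you would pass to the client constructor.
SERVERS = ["127.0.0.1:3030", "127.0.0.1:3031"]

def naive_route(key):
    """Pick a server by hashing the key modulo the server count.

    This is the simplest possible client-side routing scheme; real
    clients prefer consistent hashing, because modulo routing remaps
    almost every key whenever the server list changes.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]

# The same key always routes to the same server, so any client
# with the same server list agrees on where a key lives.
print(naive_route("ahmed"))
```

The point is that the server never participates in this decision; every routing choice is made locally in the client from its configured server list.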

Getting ready

So, it all starts when a single server cannot hold your entire cache and you need to split the cache pool across several servers.

If you are running multiple instances of the memcached daemon on the same server, make sure you are running them on different ports.

memcached -p 3030
memcached -p 3031

How to do it...

The server installation goes as previously described; the cluster configuration lives entirely on the client side, where you add the full list of servers to every client.

It's important to note that, to keep the cluster sane, every client must list the servers in exactly the same order.

As an example, I'll be using Python's pylibmc library to communicate with the memcached cluster:

import pylibmc

# The server list must be identical, and in the same order, across
# all clients that share this cluster.
mc = pylibmc.Client(["127.0.0.1:3030", "127.0.0.1:3031"],
                    binary=True,
                    behaviors={"tcp_nodelay": True, "ketama": True})

# Each key-value pair is routed to one of the daemons by the client.
mc["ahmed"] = "Hello World"
mc["tek"] = "Hello World"

How it works...

You specify the list of servers in your client configuration, and the client library uses consistent hashing to decide which server each key-value pair should go to.
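The idea behind consistent hashing can be sketched in a few lines: each server is hashed to many points on a ring, and each key is routed to the next server point after the key's own hash. The class below is a toy illustration of that idea under simplified assumptions (the name SimpleKetamaRing and the point counts are invented for this example; it is not pylibmc's or libmemcached's actual implementation):

```python
import bisect
import hashlib

def _hash(value):
    # md5 digest mapped to an integer ring position; libmemcached's
    # ketama behavior also uses md5, though the details differ.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

class SimpleKetamaRing:
    """A toy consistent-hash ring -- illustrative only."""

    def __init__(self, servers, points_per_server=100):
        # Place many points per server on the ring so that keys
        # spread roughly evenly across the servers.
        self._ring = sorted(
            (_hash("%s-%d" % (server, i)), server)
            for server in servers
            for i in range(points_per_server)
        )
        self._hashes = [h for h, _ in self._ring]

    def server_for(self, key):
        # First ring point at or after the key's hash, wrapping
        # around to the start of the ring if necessary.
        idx = bisect.bisect(self._hashes, _hash(key)) % len(self._ring)
        return self._ring[idx][1]

ring = SimpleKetamaRing(["127.0.0.1:3030", "127.0.0.1:3031"])
print(ring.server_for("ahmed"))  # deterministic: one of the two servers
```

Because the mapping is a pure function of the configured server list, every client that shares the same configuration agrees on where each key lives.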

The constructor of the client object here was fed with a couple of interesting parameters:

  • binary=True: This configures pylibmc to use the memcached binary protocol instead of the ASCII protocol.

  • behaviors={"tcp_nodelay": True, "ketama": True}: This configures the memcached connection socket to use the tcp_nodelay option, which disables Nagle's algorithm (http://en.wikipedia.org/wiki/Nagle%27s_algorithm) at the socket level. Setting "ketama": True means that pylibmc uses md5 hashing together with consistent hashing for key distribution.

    Note

    The consistent hashing algorithm relies on the order of the server list, so you need to keep all your clients in sync with the same configuration list in exactly the same order.

After creating the client object, we set two keys, ahmed and tek, both with the value Hello World. Behind the scenes, each key-value pair is stored on a different daemon, chosen by the consistent hash of its key.
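The practical payoff of consistent hashing appears when the pool changes size: only a fraction of the keys move to a different server, rather than nearly all of them as with naive modulo routing. The following self-contained demonstration uses a toy ring (the build_ring and lookup helpers are invented for this sketch, not pylibmc's implementation) to count how many keys move when a third server joins a two-server pool:

```python
import bisect
import hashlib

def build_ring(servers, points=100):
    """Toy consistent-hash ring: each server owns many points on a circle."""
    return sorted(
        (int(hashlib.md5(("%s-%d" % (s, i)).encode("utf-8")).hexdigest(), 16), s)
        for s in servers
        for i in range(points)
    )

def lookup(ring, key):
    # Route the key to the first ring point at or after its hash.
    h = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)
    hashes = [p for p, _ in ring]
    idx = bisect.bisect(hashes, h) % len(ring)
    return ring[idx][1]

old = build_ring(["127.0.0.1:3030", "127.0.0.1:3031"])
new = build_ring(["127.0.0.1:3030", "127.0.0.1:3031", "127.0.0.1:3032"])

keys = ["key-%d" % n for n in range(1000)]
moved = sum(1 for k in keys if lookup(old, k) != lookup(new, k))
# With consistent hashing, roughly a third of the keys move when a
# third server joins; with modulo routing nearly all of them would.
print("%d of %d keys moved" % (moved, len(keys)))
```

This stability is what makes it safe to grow the memory pool incrementally: most cached entries stay on the server they were written to, so you avoid a cluster-wide stampede of cache misses.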

Caching with persistence

Sometimes you want your caching server to persist data to disk; there are several very good alternatives to memcached that can help you achieve that.

You can check out Redis at http://redis.io and Kyoto Tycoon at http://fallabs.com/kyototycoon/.