Book Image

Couchbase Essentials

Book Image

Couchbase Essentials

Overview of this book

Table of Contents (15 chapters)
Couchbase Essentials
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Index

Couchbase architecture


Before we move on to developing with Couchbase, it's useful to understand the general Couchbase architecture. While coding against a single-node cluster should generally be no different than coding against a 10-node cluster, supporting a production application does require deeper understanding of what could go wrong, as your application needs to scale out. In the following sections, I'll describe in more detail some of the concepts we've already seen, and the basics of how a Couchbase cluster operates.

Couchbase clusters

Fundamental to all Couchbase deployments is the notion of a cluster. This is a common term in the NoSQL world and generally refers to a collection of nodes performing operations on a data store in tandem. However, how nodes in a cluster behave varies significantly across NoSQL products. In some systems, all nodes are peers, with no differences. In others, clusters are set up in master-slave configurations.

In a Couchbase cluster, nodes are interchangeable. Each node contains a cluster manager, which is responsible for knowing the status of other nodes in the cluster, and for allowing other nodes to know its status. As each node has its own cluster manager component, this allows Couchbase Server to scale out linearly with no single point of failure.

Replication

One of the most important tasks of the cluster manager is to ensure that all of the data is available to clients. Couchbase Server replication works by making one node the master node for a given document, while up to three slave nodes maintain a replica of that document. In case the cluster manager detects a node failure, it is responsible for promoting replicas to the primary node.

Balancing and rebalancing

Sharding is the notion of distributing data evenly across the nodes of a cluster. In most sharded systems, the admin is responsible for picking a shard key to be used for data distribution. For example, a Users table might be sharded on a Username field. If the shard key turns out to be poorly distributed (imagine 30 percent of users having usernames starting with T), then the nodes will not be well balanced.

Couchbase, in contrast, is auto-sharded and guarantees balance. Recall that Couchbase documents are stored using a key/value approach. Though the user supplies the key, Couchbase SDKs use a strong and cryptographic hash on each key to guarantee that keys will be evenly distributed across a cluster. This hashing action considers the topology of the cluster, which means that whether there are 2 or 20 nodes, the keys will still be balanced.

Even though the SDKs and the server work together to ensure proper sharding, in case a node (or nodes) goes offline, that balance will temporarily be broken. This is because replicas are promoted. As nodes are added or removed from a cluster, the cluster manager will work to rebalance the data across the nodes. A newly added node may not be ready to fully join the cluster until a rebalance has been performed. As alluded to earlier, this task may be done using the Couchbase Console.