HBase has two run modes—standalone mode and distributed mode. Standalone mode is the default mode of HBase. In standalone mode, HBase uses a local filesystem instead of HDFS, and runs all HBase daemons and an HBase-managed ZooKeeper instance, all in the same JVM.
This recipe describes the setup of a standalone HBase. It leads you through installing HBase, starting it in standalone mode, creating a table via HBase Shell, inserting rows, and then cleaning up and shutting down the standalone HBase instance.
You are going to need a Linux machine to run the stack. Running HBase on top of Windows is not recommended. We will use Debian 6.0.1 (Debian Squeeze) in this book, because we have several Hadoop/HBase clusters running on top of Debian in production at my company, Rakuten Inc., and 6.0.1 is the latest Amazon Machine Image (AMI) we have, at http://wiki.debian.org/Cloud/AmazonEC2Image.
As HBase is written in Java, you will need to have Java installed first. HBase runs on Oracle's JDK only, so do not use OpenJDK for the setup. Although Java 7 is available, we do not recommend using it yet, because it needs more time to be tested. You can download the latest Java SE 6 from the following link: http://www.oracle.com/technetwork/java/javase/downloads/index.html.
Execute the downloaded bin file to install Java SE 6. We will use /usr/local/jdk1.6 as JAVA_HOME in this book:
root# ln -s /your/java/install/directory /usr/local/jdk1.6
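Before moving on, it can help to confirm that the symbolic link really points at a usable Java installation. The following is a minimal sketch; check_java_home is a hypothetical helper introduced here for illustration and is not part of HBase or the JDK.

```shell
#!/bin/sh
# Sketch: verify that a candidate JAVA_HOME contains an executable bin/java.
# check_java_home is a hypothetical helper, not an HBase or JDK command.
check_java_home() {
    candidate="$1"
    # -x follows symbolic links, so a link such as /usr/local/jdk1.6 works too.
    if [ -x "$candidate/bin/java" ]; then
        echo "ok: $candidate/bin/java is executable"
        return 0
    fi
    echo "error: no executable java under $candidate/bin" >&2
    return 1
}
```

For example, running check_java_home /usr/local/jdk1.6 && /usr/local/jdk1.6/bin/java -version should print the installed Java version if the link is correct.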
We will add a user named hadoop as the owner of all HBase/Hadoop daemons and files. We will have all HBase files and data stored under /usr/local/hbase:
root# useradd hadoop
root# mkdir /usr/local/hbase
root# chown hadoop:hadoop /usr/local/hbase
Get the latest stable HBase release from HBase's official site, http://www.apache.org/dyn/closer.cgi/hbase/. At the time of writing this book, the current stable release was 0.92.1.
You can set up a standalone HBase instance by following these instructions:
1. Download the tarball and decompress it to our root directory for HBase. We will set an HBASE_HOME environment variable to make the setup easier, by using the following commands:
root# su - hadoop
hadoop$ cd /usr/local/hbase
hadoop$ tar xfvz hbase-0.92.1.tar.gz
hadoop$ ln -s hbase-0.92.1 current
hadoop$ export HBASE_HOME=/usr/local/hbase/current
2. Set JAVA_HOME in HBase's environment setting file, by using the following command:
hadoop$ vi $HBASE_HOME/conf/hbase-env.sh
# The java implementation to use. Java 1.6 required.
export JAVA_HOME=/usr/local/jdk1.6
3. Create a directory for HBase to store its data and set the path in the HBase configuration file (hbase-site.xml), between the <configuration> tags, by using the following commands:
hadoop$ mkdir -p /usr/local/hbase/var/hbase
hadoop$ vi /usr/local/hbase/current/conf/hbase-site.xml
<property>
  <name>hbase.rootdir</name>
  <value>file:///usr/local/hbase/var/hbase</value>
</property>
4. Start HBase in standalone mode by using the following command:
hadoop$ $HBASE_HOME/bin/start-hbase.sh
starting master, logging to /usr/local/hbase/current/logs/hbase-hadoop-master-master1.out
5. Connect to the running HBase via HBase Shell, using the following command:
hadoop$ $HBASE_HOME/bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.92.1, r1298924, Fri Mar 9 16:58:34 UTC 2012
6. Verify HBase's installation by creating a table and then inserting some values. Create a table named test, with a single column family named cf1, as shown here:
hbase(main):001:0> create 'test', 'cf1'
0 row(s) in 0.7600 seconds
i. In order to list the newly created table, use the following command:
hbase(main):002:0> list
TABLE
test
1 row(s) in 0.0440 seconds
ii. In order to insert some values into the newly created table, use the following commands:
hbase(main):003:0> put 'test', 'row1', 'cf1:a', 'value1'
0 row(s) in 0.0840 seconds
hbase(main):004:0> put 'test', 'row1', 'cf1:b', 'value2'
0 row(s) in 0.0320 seconds
7. Verify the data we inserted into HBase by using the scan command:
hbase(main):003:0> scan 'test'
ROW COLUMN+CELL
row1 column=cf1:a, timestamp=1320947312117, value=value1
row1 column=cf1:b, timestamp=1320947363375, value=value2
1 row(s) in 0.2530 seconds
8. Now clean up all that was done, by using the disable and drop commands:
i. In order to disable the table test, use the following command:
hbase(main):006:0> disable 'test'
0 row(s) in 7.0770 seconds
ii. In order to drop the table test, use the following command:
hbase(main):007:0> drop 'test'
0 row(s) in 11.1290 seconds
9. Exit from HBase Shell using the following command:
hbase(main):010:0> exit
10. Stop the HBase instance by executing the stop script:
hadoop$ /usr/local/hbase/current/bin/stop-hbase.sh
stopping hbase.......
We installed HBase 0.92.1 on a single server. We used a symbolic link named current for it, so that future version upgrades are easy to do.
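To illustrate why the symbolic link makes upgrades easier, here is a rough sketch of repointing current at another unpacked release. switch_release is a hypothetical helper, and the ln -sfn form assumes GNU coreutils, as found on Debian.

```shell
#!/bin/sh
# Sketch: repoint the "current" symlink at a different release directory.
# switch_release is a hypothetical helper, not an HBase command.
switch_release() {
    base_dir="$1"   # for example /usr/local/hbase
    release="$2"    # name of an unpacked release directory under base_dir
    if [ ! -d "$base_dir/$release" ]; then
        echo "error: $base_dir/$release does not exist" >&2
        return 1
    fi
    # -s: symbolic link, -f: replace an existing link,
    # -n: treat the existing "current" link as a file, do not descend into it
    ln -sfn "$release" "$base_dir/current"
}
```

Because HBASE_HOME points at /usr/local/hbase/current, a call such as switch_release /usr/local/hbase hbase-X.Y.Z (where hbase-X.Y.Z stands for whichever release directory you have unpacked) would change the version in use without touching HBASE_HOME. It is presumably safest to stop HBase before switching.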
In order to inform HBase where Java is installed, we set JAVA_HOME in hbase-env.sh, which is the environment setting file of HBase. You will see some Java heap and HBase daemon settings in it too. We will discuss these settings in the last two chapters of this book.
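If you prefer not to edit hbase-env.sh by hand, the same setting could be applied from a script. The following is only a sketch; set_java_home is a hypothetical helper that removes any uncommented JAVA_HOME export and appends a new one, leaving commented lines untouched.

```shell
#!/bin/sh
# Sketch: set JAVA_HOME in an hbase-env.sh style file without an editor.
# set_java_home is a hypothetical helper, not shipped with HBase.
set_java_home() {
    env_file="$1"
    java_home="$2"
    if [ ! -f "$env_file" ]; then
        echo "error: $env_file not found" >&2
        return 1
    fi
    tmp_file="$env_file.tmp.$$"
    # Keep every line except an existing uncommented JAVA_HOME export.
    grep -v '^export JAVA_HOME=' "$env_file" > "$tmp_file"
    echo "export JAVA_HOME=$java_home" >> "$tmp_file"
    # Write back in place so the original file's permissions are kept.
    cat "$tmp_file" > "$env_file"
    rm -f "$tmp_file"
}
```

In this recipe the call would be set_java_home $HBASE_HOME/conf/hbase-env.sh /usr/local/jdk1.6.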
In step 3, we created a directory on the local filesystem, for HBase to store its data. For a fully distributed installation, HBase needs to be configured to use HDFS, instead of a local filesystem. The HBase master daemon (HMaster) is started on the server where start-hbase.sh is executed. As we did not configure the region server here, HBase will start a single slave daemon (HRegionServer) in the same JVM too.
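The hbase.rootdir setting described above can also be written from a script. This is a minimal sketch that generates a complete hbase-site.xml containing only that one property; write_hbase_site is a hypothetical helper, and it overwrites the target file, so it is suited to a fresh installation only.

```shell
#!/bin/sh
# Sketch: generate a minimal hbase-site.xml holding only hbase.rootdir.
# write_hbase_site is a hypothetical helper; it OVERWRITES the target file.
write_hbase_site() {
    site_file="$1"
    root_dir="$2"   # a file:// URL for standalone mode
    cat > "$site_file" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>$root_dir</value>
  </property>
</configuration>
EOF
}
```

For this recipe the call would be write_hbase_site $HBASE_HOME/conf/hbase-site.xml file:///usr/local/hbase/var/hbase. A distributed installation would use an hdfs:// URL here instead, along with further settings that this sketch does not cover.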
As we mentioned in the Introduction section, HBase depends on ZooKeeper as its coordination service. You may have noticed that we didn't start ZooKeeper in the previous steps. This is because HBase will start and manage its own ZooKeeper ensemble, by default.
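Whether HBase manages ZooKeeper itself is controlled by the HBASE_MANAGES_ZK variable in hbase-env.sh. As a rough sketch, the following helper reports the effective value for a given file; manages_zk is a hypothetical name, and the fallback to true when the variable is unset reflects the default behavior described above.

```shell
#!/bin/sh
# Sketch: report whether an hbase-env.sh style file lets HBase manage ZooKeeper.
# manages_zk is a hypothetical helper, not an HBase command.
manages_zk() {
    env_file="$1"
    # Take the last uncommented assignment, if there is one.
    value=$(sed -n 's/^export HBASE_MANAGES_ZK=//p' "$env_file" | tail -n 1)
    # Assumption: when the variable is unset, HBase manages ZooKeeper (true).
    echo "${value:-true}"
}
```

Running manages_zk $HBASE_HOME/conf/hbase-env.sh on the untouched file from this recipe should print true, since the setting is only present as a comment.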
Then we connected to HBase via HBase Shell. Using HBase Shell, you can manage your cluster, access data in HBase, and do many other jobs. Here, we just created a table called test, inserted data into it, scanned the test table, disabled and dropped it, and then exited the shell.
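The shell session from this recipe can also be captured in a file, which may be convenient for repeating the check later. The sketch below only writes the command file; write_smoke_test is a hypothetical helper, and the table and column family names are passed in as parameters.

```shell
#!/bin/sh
# Sketch: write the recipe's HBase Shell commands into a script file.
# write_smoke_test is a hypothetical helper; it does not contact HBase itself.
write_smoke_test() {
    script_file="$1"
    table="$2"
    family="$3"
    cat > "$script_file" <<EOF
create '${table}', '${family}'
list
put '${table}', 'row1', '${family}:a', 'value1'
put '${table}', 'row1', '${family}:b', 'value2'
scan '${table}'
disable '${table}'
drop '${table}'
exit
EOF
}
```

After write_smoke_test /tmp/smoke.rb test cf1, the file can be passed to the shell as $HBASE_HOME/bin/hbase shell /tmp/smoke.rb, assuming your HBase version accepts a script path as an argument. The file name /tmp/smoke.rb is only an example.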
HBase can be stopped using its stop-hbase.sh script. This script stops both HMaster and HRegionServer daemons.
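To confirm that the instance has really stopped, one option is to inspect the output of the JDK's jps tool. The following is a sketch under the assumption that the standalone JVM is listed by jps under the name HMaster; hmaster_running is a hypothetical helper that reads jps-style output from standard input.

```shell
#!/bin/sh
# Sketch: detect an HMaster entry in jps-style output read from stdin.
# hmaster_running is a hypothetical helper. Assumption: the standalone
# HBase JVM appears in jps output as "<pid> HMaster".
hmaster_running() {
    if grep -q ' HMaster$'; then
        return 0
    fi
    return 1
}
```

A typical check after stop-hbase.sh would be: /usr/local/jdk1.6/bin/jps | hmaster_running && echo "HBase is still running".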