For data analytics applications that require Hadoop Distributed File System (HDFS) access, the Ceph object gateway can be accessed through the Apache S3A connector for Hadoop. The S3A connector is an open source tool that presents S3-compatible object storage to applications as an HDFS file system, with HDFS read and write semantics, while the data itself is stored in the Ceph object gateway.
Ceph object gateway Jewel version 10.2.9 is fully compatible with the S3A connector that ships with Hadoop 2.7.3.
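To illustrate what "presenting object storage as an HDFS file system" looks like from the client side, the following is a minimal sketch, not a verified procedure. The endpoint, credentials, bucket name, and the `s3a_ls` helper are all hypothetical placeholders. The `fs.s3a.*` property names are standard S3A connector settings. Whether the command works unmodified depends on your setup, for example on the `hadoop-aws` module being on the Hadoop classpath and on how the gateway resolves bucket names.

```shell
# All values below are placeholders (assumptions), not values from this document.
RGW_ENDPOINT="rgw.example.com:7480"        # hypothetical gateway host:port
S3_ACCESS_KEY="ACCESS_KEY_PLACEHOLDER"     # hypothetical S3 access key
S3_SECRET_KEY="SECRET_KEY_PLACEHOLDER"     # hypothetical S3 secret key

# Hypothetical helper: list a bucket through the S3A connector by passing
# the connector settings as Hadoop generic options (-D key=value).
s3a_ls() {
    hadoop fs \
        -D fs.s3a.endpoint="$RGW_ENDPOINT" \
        -D fs.s3a.connection.ssl.enabled=false \
        -D fs.s3a.access.key="$S3_ACCESS_KEY" \
        -D fs.s3a.secret.key="$S3_SECRET_KEY" \
        -ls "s3a://$1/"
}
```

Once Hadoop is installed and on the `PATH`, calling `s3a_ls example-bucket` would list the bucket with the same `hadoop fs -ls` command used for HDFS paths. Only the URI scheme (`s3a://`) differs.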
You can use client-node1 to configure the Hadoop S3A client.
- Install Java packages on client-node1:
# yum install java* -y
- Download the Hadoop .tar file from the Apache archive: https://archive.apache.org/dist/hadoop/core/hadoop-2.7.3/hadoop-2.7.3.tar.gz
- Extract the Hadoop .tar file:
# tar -xvf hadoop-2.7.3.tar.gz
- Add the following to the .bashrc file:
export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk export ...
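The download and extraction steps above could be scripted roughly as follows. This is a sketch under stated assumptions: it uses `curl` (the document does not name a download tool), the variable names and the `fetch_and_extract_hadoop` helper are hypothetical, and the URL is the one given in the download step.

```shell
# Hypothetical variable names; the resulting URL matches the one in the download step.
HADOOP_VERSION="2.7.3"
HADOOP_TARBALL="hadoop-${HADOOP_VERSION}.tar.gz"
HADOOP_URL="https://archive.apache.org/dist/hadoop/core/hadoop-${HADOOP_VERSION}/${HADOOP_TARBALL}"

# Hypothetical helper: download the archive into the current directory
# (unless it is already there) and extract it.
fetch_and_extract_hadoop() {
    # -f: fail on HTTP errors, -L: follow redirects, -O: keep the remote file name
    [ -f "$HADOOP_TARBALL" ] || curl -fLO "$HADOOP_URL" || return 1
    tar -xzf "$HADOOP_TARBALL"
}
```

Running `fetch_and_extract_hadoop` on client-node1 should leave a `hadoop-2.7.3` directory next to the archive. Re-running it skips the download if the archive file already exists.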