When handling sensitive data, security measures must always be considered. Hadoop allows us to encrypt sensitive data stored in HDFS. In this recipe, we are going to see how to encrypt data in HDFS.
For many applications that hold sensitive data, it is very important to adhere to standards such as PCI, HIPAA, FISMA, and so on. To support this, HDFS provides a feature called encryption zones. An encryption zone is a directory in which data is transparently encrypted on write and decrypted on read.
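To make the idea concrete, here is a minimal sketch of how an encryption zone is typically created once KMS is running and HDFS has been configured to use it. The key name `mykey` and the path `/secure_zone` are hypothetical placeholders, not values taken from this recipe. The script only prints the commands by default; set `RUN=1` to execute them on a real cluster.

```shell
#!/bin/sh
# Sketch of the encryption-zone workflow.
# KEY_NAME and ZONE_PATH are hypothetical placeholders.
KEY_NAME="mykey"
ZONE_PATH="/secure_zone"

# Dry-run helper: print the command unless RUN=1 is set.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "+ $*"
  fi
}

zone_workflow() {
  # Create an encryption key in the configured key provider (KMS).
  run hadoop key create "$KEY_NAME"
  # The zone directory must exist and be empty before it becomes a zone.
  run hdfs dfs -mkdir -p "$ZONE_PATH"
  # Turn the directory into an encryption zone (requires HDFS superuser).
  run hdfs crypto -createZone -keyName "$KEY_NAME" -path "$ZONE_PATH"
  # List all encryption zones to confirm the new one is present.
  run hdfs crypto -listZones
}

zone_workflow
```

Files written under the zone path are then encrypted on write and decrypted on read for authorized clients, without any change to the usual `hdfs dfs -put` and `hdfs dfs -cat` commands.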
To use this encryption facility, we first need to enable Hadoop Key Management Server (KMS):
/usr/local/hadoop/sbin/kms.sh start
This starts KMS inside the Tomcat web server.
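One way to confirm that KMS came up is to query its REST API for the list of key names. The sketch below assumes KMS listens on localhost at port 16000 (the usual default for the Tomcat-based KMS in Hadoop 2.x), that simple authentication is in use, and that the user name is `hadoop`. All three are assumptions to adjust for your installation.

```shell
#!/bin/sh
# Assumed values: override them through the environment if your setup differs.
KMS_HOST="${KMS_HOST:-localhost}"
KMS_PORT="${KMS_PORT:-16000}"
KMS_USER="${KMS_USER:-hadoop}"

# Build the URL of the KMS endpoint that lists key names.
kms_names_url() {
  echo "http://${KMS_HOST}:${KMS_PORT}/kms/v1/keys/names?user.name=${KMS_USER}"
}

# Query KMS. -s hides the progress meter, -f makes curl exit non-zero
# on HTTP errors, so the exit status reflects whether KMS answered.
check_kms() {
  curl -sf "$(kms_names_url)"
}

echo "To check KMS, call check_kms or run: curl -sf '$(kms_names_url)'"
```

If KMS is up, `check_kms` should return a JSON array of key names, which is empty when no keys have been created yet. A connection error suggests that KMS did not start or is listening on a different port.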
Next, we need to append the following properties in core-site.xml and hdfs-site.xml.
In core-site.xml, add the following property:
<property> <name>hadoop.security.key...