Recent Hadoop releases have added the security feature by integrating Kerberos into Hadoop. Kerberos is a network authentication protocol that provides strong authentication for client/server applications. Hadoop uses Kerberos to secure data from unexpected and unauthorized accesses. It achieves this by authenticating on the underlying Remote Procedure Calls (RPC). In this recipe, we will outline steps to configure Kerberos authentication for a Hadoop cluster.
Kerberos was created by MIT. It was designed to provide strong authentication for client/server applications by using secret key cryptography. The Kerberos protocol requires that a client provide its identity to the server and vice versa. When their identities are proved by Kerberos, all of their following communication will be encrypted.
Before getting started, we assume that our Hadoop has been properly configured without any problems, and all the Hadoop daemons are running without...