Learning Hadoop 2


Command-line access to the HDFS filesystem

The Hadoop distribution includes a command-line utility called hdfs, which is the primary way to interact with the filesystem from the command line. Run it without any arguments to see the available subcommands. There are many of them, and several are used to start or stop various HDFS components. The general form of the hdfs command is:

hdfs <sub-command> <command> [arguments]

The two main subcommands we will use in this book are:

  • dfs: This is used for general filesystem access and manipulation, including reading/writing and accessing files and directories

  • dfsadmin: This is used for administration and maintenance of the filesystem. We will not cover this subcommand in detail, but it is worth trying its -report command, which lists the state of the filesystem and all DataNodes:

    $ hdfs dfsadmin -report
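
To make the general form concrete, the following is a minimal sketch of a few common dfs invocations alongside the dfsadmin report. The helper function hdfs_try, the HDFS_RUN variable, the directory /user/demo, and the file notes.txt are all hypothetical names introduced for illustration. By default the helper only prints each command, so the sketch can be read and run without a cluster; setting HDFS_RUN=1 would pass the same arguments to the real hdfs utility.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the hdfs command line by default.
# Set HDFS_RUN=1 to execute it against a real Hadoop installation.
hdfs_try() {
    if [ "${HDFS_RUN:-0}" = "1" ]; then
        hdfs "$@"
    else
        echo "hdfs $*"
    fi
}

# Each call follows the form: hdfs <sub-command> <command> [arguments]
# /user/demo and notes.txt are placeholder names, not paths from the book.
hdfs_try dfs -mkdir -p /user/demo          # create a directory (and any parents)
hdfs_try dfs -put notes.txt /user/demo/    # copy a local file into HDFS
hdfs_try dfs -ls /user/demo                # list the directory contents
hdfs_try dfs -cat /user/demo/notes.txt     # print the file to standard output
hdfs_try dfsadmin -report                  # summarize filesystem and DataNode state
```

In each line, the first word after hdfs (dfs or dfsadmin) is the subcommand, the dash-prefixed word is the command, and anything that follows is an argument to that command.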


Note that the dfs and dfsadmin commands can also be used with the main Hadoop command...