Hadoop MapReduce v2 Cookbook - Second Edition: RAW


Using the HDFS Java API


The HDFS Java API can be used to interact with HDFS from any Java program. It lets other Java programs use the data stored in HDFS and lets non-Hadoop computational frameworks process that data. Occasionally, you may also come across a use case where you want to access HDFS directly from within a MapReduce application. However, if you write or modify files in HDFS directly from a Map or Reduce task, be aware that you are violating the side-effect-free nature of MapReduce. Depending on your use case, this might lead to data consistency issues, for example when the framework runs a failed or speculative task attempt more than once.
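To make the consistency concern concrete, here is a minimal plain-Java simulation. It does not use any Hadoop classes: the local filesystem stands in for HDFS, and the class and method names (`TaskAttemptDemo`, `directWrite`, `attemptWrite`, `commit`) are hypothetical names chosen for this sketch. It assumes a task attempt may run twice, and contrasts appending straight to a shared file with writing to a per-attempt scratch file that is promoted only once.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Plain-Java simulation; the local filesystem stands in for HDFS (assumption).
final class TaskAttemptDemo {

    // Risky: every attempt appends straight to the shared output file.
    static void directWrite(Path sharedOutput, String record) throws IOException {
        Files.write(sharedOutput, List.of(record), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    // Safer: each attempt writes to its own scratch file...
    static Path attemptWrite(Path dir, int attemptId, String record) throws IOException {
        Path scratch = dir.resolve("attempt_" + attemptId + ".tmp");
        Files.write(scratch, List.of(record), StandardCharsets.UTF_8);
        return scratch;
    }

    // ...and exactly one attempt is promoted to the final output.
    static void commit(Path scratch, Path finalOutput) throws IOException {
        Files.move(scratch, finalOutput, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("attempt-demo");

        // The same logical task runs twice and writes directly.
        Path shared = dir.resolve("direct.txt");
        directWrite(shared, "key1\t42"); // first attempt
        directWrite(shared, "key1\t42"); // re-executed attempt
        System.out.println("direct write lines: " + Files.readAllLines(shared).size());

        // The same two attempts, but only one is committed.
        Path committed = dir.resolve("committed.txt");
        Path first = attemptWrite(dir, 0, "key1\t42");
        Path second = attemptWrite(dir, 1, "key1\t42");
        commit(second, committed);
        Files.deleteIfExists(first); // discard the attempt that was not committed
        System.out.println("committed lines: " + Files.readAllLines(committed).size());
    }
}
```

Running `main` prints a line count of 2 for the direct write and 1 for the committed output. Hadoop's own file-based output handling follows a broadly similar write-then-commit idea, which is one reason to prefer the framework's output mechanisms over ad hoc writes from inside a task.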

How to do it...

The following steps show you how to use the HDFS Java API to perform filesystem operations on an HDFS installation using a Java program:

  1. The following sample program creates a new file in HDFS, writes some text to the newly created file, and reads the file back from HDFS:

    import java.io.IOException;
    
    import org.apache.hadoop.conf.Configuration;
    import org...