Cloudera Administration Handbook

By: Rohit Menon

The read/write operational flow in HDFS


To understand HDFS better, we need to follow the flow of operations in the following two scenarios:

  • A file is written to HDFS

  • A file is read from HDFS

HDFS uses a write-once, read-many model: a file is written once and read many times. Data that has been written cannot be modified in place. However, data can be appended to a file by reopening it. All files in HDFS are stored as data blocks.
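
To make the block model concrete, the following minimal sketch computes how a file of a given size would be divided into blocks. It is plain Python and does not use any HDFS API. The function name `split_into_blocks` is hypothetical, and the 128 MB block size is an assumption; the actual value depends on the cluster's configuration.

```python
def split_into_blocks(file_size, block_size=128 * 1024 * 1024):
    """Return a list of (offset, length) pairs, one per block.

    Hypothetical helper for illustration only; block_size is an
    assumed value, not one read from a real cluster.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    blocks = []
    offset = 0
    while offset < file_size:
        # The final block holds only the bytes that remain.
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


MB = 1024 * 1024
blocks = split_into_blocks(300 * MB)
for index, (offset, length) in enumerate(blocks):
    print(f"block {index}: offset={offset // MB} MB, length={length // MB} MB")
```

Under these assumptions, a 300 MB file maps to two full 128 MB blocks followed by a final 44 MB block.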

Writing files in HDFS

The following sequence of steps occurs when a client tries to write a file to HDFS:

  1. The client informs the namenode daemon that it wants to write a file. The namenode daemon checks to see whether the file already exists.

  2. If the file exists, the namenode daemon sends an appropriate message back to the client. If it does not exist, the namenode daemon makes a metadata entry for the new file.

  3. The file to be written is split into data packets at the client end and a data queue is built. The packets in the queue are then streamed to the datanodes in the...