HDInsight Essentials - Second Edition

By: Rajesh Nadipalli

Challenges in building a production Data Lake


Most organizations start with a short proof of concept (POC) that demonstrates the value of big data and the Hadoop ecosystem. These POCs are typically run as research exercises with specific datasets and goals, and they are generally successful.

After the POC readout, management typically raises the following key questions, which can block further progress:

  • Do we have the development skills to handle this technology on a large scale?

  • How do we integrate this Hadoop Data Lake with current systems?

  • How do we secure data in Hadoop and meet compliance requirements?

  • Can the current operations team manage this in production?

These are tough questions, and answering them requires changes in people, process, and technology before the organization can move to a modern Data Lake architecture. In the next section, we will review a few key steps for a successful Data Lake implementation.