The preceding chapters discussed the key concepts of big data technologies. This chapter covers how big data clusters are built, along with the main frameworks, their key components, and the architectures of popular vendor distributions. Big data DevOps concepts are discussed in subsequent chapters.
We will cover the following topics in this chapter:
- Big data Hadoop ecosystems
- Big data clusters
  - Types and application
  - High availability
  - Load balancing
- Big data nodes
  - Master, worker, edge nodes
  - Their roles
- Hadoop frameworks
  - Cloudera CDH Hadoop distribution
  - Hortonworks Data Platform (HDP)
  - MapR Hadoop distribution
  - Pivotal Big Data Suite
  - IBM Open Platform
- Cloud-based Hadoop distribution
  - Amazon Elastic MapReduce
  - Microsoft Azure's HDInsight
- Capacity planning
  - Factors
  - Guidelines