
Building a Big Data Analytics Stack [Video]

By: Tomasz Lelek

Overview of this book

<p><span id="description" class="sugar_field">Building a Big Data ecosystem is hard. There are a variety of technologies available, and each of them has its own pros and cons. As software engineers building a big data pipeline, we need to use lower-level tools and APIs such as HBase and Apache Spark.</span></p> <p><span id="description" class="sugar_field">In this course, we’ll check out HBase, a database built on top of HDFS. Moving on, we’ll have a bit of fun with Spark MLlib. Finally, you’ll get an understanding of ETL and deploy a Hadoop project to the cloud.</span></p> <p><span id="description" class="sugar_field">By the end of the course, you’ll be able to use higher-level tools with more user-friendly, declarative APIs, such as Pig and Hive.</span></p> <h2><span class="sugar_field">Style and Approach</span></h2> <p><span class="sugar_field"><span id="trade_selling_points_c" class="sugar_field">This course will give you both a conceptual understanding of and practical, hands-on experience with Hadoop 2.7. It also looks at Spark, Pig, Hive, HBase, and YARN, so you can understand how to implement these components when working with Hadoop clusters.</span></span></p>
Table of Contents (5 chapters)
Chapter 2
Spark Your Engines
Section 1
Writing Spark Jobs
In this video, we will be writing Spark jobs.
- Explore the Spark architecture and RDD
- Learn about partitioning in Spark
- Explore transformations and actions
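
The three topics above can be illustrated with a small sketch. This is plain Python, not Spark: `MiniRDD` and its methods are hypothetical names made up for this example, and the round-robin split is an assumption chosen for simplicity. It only mimics the general idea that data is held in partitions, that transformations such as `map` and `filter` are recorded lazily, and that work happens only when an action such as `collect` or `count` is called.

```python
from typing import Callable, Iterable, Iterator, List, Optional


class MiniRDD:
    """Toy stand-in for an RDD (hypothetical; this is not Spark's API)."""

    def __init__(self, partitions: List[list],
                 stages: Optional[List[Callable[[Iterator], Iterator]]] = None):
        self.partitions = partitions
        # Each stage turns an iterator over one partition into a new iterator.
        self.stages = stages or []

    @classmethod
    def parallelize(cls, data: Iterable, num_partitions: int) -> "MiniRDD":
        items = list(data)
        # Assumption: a simple round-robin split into partitions.
        return cls([items[i::num_partitions] for i in range(num_partitions)])

    # Transformations: record the step and return a new MiniRDD. Nothing runs yet.
    def map(self, fn: Callable) -> "MiniRDD":
        return MiniRDD(self.partitions, self.stages + [lambda it: map(fn, it)])

    def filter(self, pred: Callable) -> "MiniRDD":
        return MiniRDD(self.partitions, self.stages + [lambda it: filter(pred, it)])

    def _run(self, part: list) -> Iterator:
        it = iter(part)
        for stage in self.stages:
            it = stage(it)
        return it

    # Actions: evaluate the recorded steps partition by partition.
    def collect(self) -> list:
        return [x for part in self.partitions for x in self._run(part)]

    def count(self) -> int:
        return sum(1 for part in self.partitions for _ in self._run(part))


calls = []  # records every element that square() actually processes


def square(x: int) -> int:
    calls.append(x)
    return x * x


rdd = MiniRDD.parallelize(range(1, 7), num_partitions=2)
pipeline = rdd.map(square).filter(lambda x: x % 2 == 0)

ran_before_action = list(calls)  # still empty: transformations are lazy
result = pipeline.collect()      # the action triggers the computation
```

In this sketch the input is split into the partitions `[1, 3, 5]` and `[2, 4, 6]`, and `square` is not called until `collect()` runs. A real Spark job adds scheduling, distribution across executors, and fault tolerance on top of this basic model.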