Learning Hunk

By: Dmitry Anoshin, Sergey Sheypak

Overview of this book

Hunk is the big data analytics platform that lets you rapidly explore, analyze, and visualize data in Hadoop and NoSQL data stores. It provides a single, fluid user experience, designed to show you insights from your big data without the need for specialized skills, fixed schemas, or months of development. Hunk goes beyond typical data analysis methods and gives you the power to rapidly detect patterns and find anomalies across petabytes of raw data. This book focuses on exploring, analyzing, and visualizing big data in Hadoop and NoSQL data stores with this powerful, full-featured big data analytics platform. You will begin by learning the Hunk architecture and the Hunk Virtual Index before moving on to analyzing and visualizing data using the Splunk Search Processing Language (SPL). Next, you will meet Hunk Apps, which can easily integrate with NoSQL data stores such as MongoDB or Sqrrl. You will also discover Hunk knowledge objects, build a semantic layer on top of Hadoop, and explore data using the friendly user interface of Hunk Pivot. You will connect MongoDB and explore data in the data store. Finally, you will go through report acceleration techniques and analyze data in the AWS Cloud.
Table of Contents (14 chapters)

Integrating Hunk with EMR and S3


Integrating Hunk with EMR and S3 is a pretty sensible proposition. If we connect the vast amounts of data that we store in HDFS or S3 with the rich capabilities of Hunk, we can build a full analytics solution in the cloud for data of any type and any size.

Fundamentally, we have a three-tier architecture. The first tier is data storage based on HDFS or S3. The next one is the compute or processing framework, provided by EMR. Finally, the visualization, data discovery, analytics, and app development framework is provided by Hunk.

The traditional method for hosting Hunk in the cloud is to simply buy a standard license and then provision a virtual machine in much the same way you would do it on-site. The instance would then have to be manually configured to point to the correct Hadoop or AWS cluster. This method is also called Bring Your Own License (BYOL).
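To make the manual configuration step more concrete, here is a minimal Python sketch that renders the kind of `indexes.conf` fragment a BYOL instance might need in order to point at a Hadoop cluster. The `vix.*` keys follow the provider and virtual index settings as we understand them from the Hunk documentation, so verify them against your version. The provider name, index name, host, and all paths are hypothetical placeholders, and the helper functions are ours, not part of Hunk.

```python
def render_stanza(name, settings):
    """Render one .conf stanza: a [name] header followed by key = value lines."""
    lines = [f"[{name}]"]
    lines += [f"{key} = {value}" for key, value in settings.items()]
    return "\n".join(lines)


def build_indexes_conf(provider, namenode_uri, data_path, index_name):
    """Build an indexes.conf fragment with one provider and one virtual index.

    All concrete values below are illustrative assumptions, not defaults.
    """
    provider_stanza = render_stanza(
        f"provider:{provider}",
        {
            "vix.family": "hadoop",
            "vix.env.JAVA_HOME": "/usr/lib/jvm/java",      # placeholder path
            "vix.env.HADOOP_HOME": "/usr/lib/hadoop",      # placeholder path
            "vix.fs.default.name": namenode_uri,           # storage tier (HDFS or S3)
            "vix.splunk.home.hdfs": "/user/hunk/workdir",  # placeholder scratch dir
        },
    )
    index_stanza = render_stanza(
        index_name,
        {
            "vix.provider": provider,      # ties the virtual index to the provider
            "vix.input.1.path": data_path,  # where the raw data lives
        },
    )
    return provider_stanza + "\n\n" + index_stanza + "\n"


# Hypothetical values for an EMR-backed cluster.
conf_text = build_indexes_conf(
    provider="emr-provider",
    namenode_uri="hdfs://namenode.example.internal:8020",
    data_path="/data/weblogs",
    index_name="emr_logs",
)
print(conf_text)
```

The split into a provider stanza and a virtual index stanza mirrors the tiers described above: the provider describes the compute and storage endpoints, while the virtual index names the data that Hunk should expose for search.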

On the other hand, Splunk and Amazon offer another method, in which Hunk instances can be automatically...