Learning Hadoop 2

By: Gerald Turkington, Gabriele Modena

Kite Data

The Kite SDK is a collection of classes, command-line tools, and examples that aims to ease the process of building applications on top of Hadoop.

In this section we will look at how Kite Data, a subproject of Kite, can ease integration with several components of a Hadoop data warehouse. Kite examples can be found at

On Cloudera's QuickStart VM, Kite JARs can be found at /opt/cloudera/parcels/CDH/lib/kite/.

Kite Data is organized into several subprojects, some of which we'll describe in the following sections.

Data Core

As the name suggests, the core is the building block for all capabilities provided in the Data module. Its principal abstractions are datasets and repositories.
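
To make the relationship between these two abstractions concrete, here is a minimal plain-Java sketch. It does not use the Kite API. The names SketchDescriptor, SketchDataset, and SketchRepository are hypothetical and invented for illustration only, and the model assumes that a repository is simply a namespace of named datasets, each carrying a descriptor and a read-only collection of records. The real Kite classes offer considerably more.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only: none of these types are part of the Kite SDK.
public class DatasetModelSketch {

    /** Hypothetical descriptor: here it only carries a schema string. */
    static final class SketchDescriptor {
        private final String schema;

        SketchDescriptor(String schema) {
            this.schema = schema;
        }

        String getSchema() {
            return schema;
        }
    }

    /** Hypothetical dataset: a name, a descriptor, and read-only records. */
    static final class SketchDataset<E> {
        private final String name;
        private final SketchDescriptor descriptor;
        private final List<E> records;

        SketchDataset(String name, SketchDescriptor descriptor, List<E> records) {
            this.name = name;
            this.descriptor = descriptor;
            // Defensive copy wrapped as unmodifiable, so the dataset cannot change.
            this.records = Collections.unmodifiableList(new ArrayList<>(records));
        }

        String getName() {
            return name;
        }

        SketchDescriptor getDescriptor() {
            return descriptor;
        }

        List<E> getRecords() {
            return records;
        }
    }

    /** Hypothetical repository: creates and looks up datasets by name. */
    static final class SketchRepository {
        private final Map<String, SketchDataset<?>> datasets = new HashMap<>();

        <E> SketchDataset<E> create(String name, SketchDescriptor descriptor, List<E> records) {
            if (datasets.containsKey(name)) {
                throw new IllegalStateException("Dataset already exists: " + name);
            }
            SketchDataset<E> dataset = new SketchDataset<>(name, descriptor, records);
            datasets.put(name, dataset);
            return dataset;
        }

        boolean exists(String name) {
            return datasets.containsKey(name);
        }

        SketchDataset<?> load(String name) {
            SketchDataset<?> dataset = datasets.get(name);
            if (dataset == null) {
                throw new IllegalArgumentException("No such dataset: " + name);
            }
            return dataset;
        }
    }

    public static void main(String[] args) {
        SketchRepository repo = new SketchRepository();
        SketchDescriptor descriptor = new SketchDescriptor("{\"type\": \"string\"}");

        List<String> records = new ArrayList<>();
        records.add("alice");
        records.add("bob");
        repo.create("users", descriptor, records);

        SketchDataset<?> loaded = repo.load("users");
        System.out.println(loaded.getName() + " holds " + loaded.getRecords().size() + " records");
    }
}
```

The sketch keeps the dataset read-only by copying the records on construction and exposing them through an unmodifiable list, so a caller cannot alter a dataset after the repository has created it.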

The Dataset interface is used to represent an immutable set of data:

public interface Dataset<E> extends RefinableView<E> {
  String getName();
  DatasetDescriptor getDescriptor();