Spark Cookbook

By: Rishi Yadav

Creating HiveContext


SQLContext and its descendant HiveContext are the two entry points into the world of Spark SQL. HiveContext provides a superset of functionality provided by SQLContext. The additional features are:

  • More complete and battle-tested HiveQL parser

  • Access to Hive UDFs

  • Ability to read data from Hive tables

From Spark 1.3 onwards, the Spark shell comes preloaded with sqlContext (which is an instance of HiveContext, not SQLContext). If you are writing Scala code, you can create an SQLContext from a SparkContext as follows:

// An existing SparkContext
val sc: SparkContext
// Wrap it to get an SQLContext
val sqlContext = new org.apache.spark.sql.SQLContext(sc)

In this recipe, we will cover how to create an instance of HiveContext and then access Hive functionality through Spark SQL.
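As a minimal sketch, creating a HiveContext mirrors the SQLContext example above: you pass it an existing SparkContext. The application name, master URL, and the person table queried here are placeholders for illustration, not part of the recipe:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Illustrative app name and master URL
val conf = new SparkConf().setAppName("HiveContextRecipe").setMaster("local[*]")
val sc = new SparkContext(conf)

// HiveContext is constructed from a SparkContext, just like SQLContext
val hiveContext = new HiveContext(sc)

// HiveQL queries now run against tables in the Hive metastore;
// "person" is a hypothetical table name
val result = hiveContext.sql("SELECT first_name, age FROM person")
result.show()
```

Because HiveContext is a subclass of SQLContext, any code written against SQLContext works unchanged if you swap in a HiveContext.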

Getting ready

To enable Hive functionality, make sure that the Hive-enabled assembly JAR (built with the -Phive profile) is available on all worker nodes; also, copy hive-site.xml into the conf directory of the Spark installation. It is important that Spark has access to hive-site.xml...
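As a sketch of what that configuration might contain, a minimal hive-site.xml pointing Spark at a remote Hive metastore could look like the following; the host name and port are placeholders for your environment:

```xml
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <!-- Placeholder host/port; point this at your Hive metastore service -->
    <value>thrift://metastore-host:9083</value>
  </property>
</configuration>
```

If no hive-site.xml is found, HiveContext falls back to a local embedded metastore, which is convenient for experimentation but not shared across nodes.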