SQLContext and its descendant HiveContext are the two entry points into the world of Spark SQL. HiveContext provides a superset of the functionality provided by SQLContext. The additional features are:

- A more complete and battle-tested HiveQL parser
- Access to Hive UDFs
- The ability to read data from Hive tables
From Spark 1.3 onwards, the Spark shell comes loaded with sqlContext (which is an instance of HiveContext, not SQLContext). If you are creating a SQLContext in Scala code, it can be created using SparkContext, as follows:
val sc: SparkContext
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
In this recipe, we will cover how to create an instance of HiveContext, and then access Hive functionality through Spark SQL.
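As a preview, creating a HiveContext follows the same pattern as SQLContext: it is constructed from an existing SparkContext. The sketch below is a minimal, hedged example; the application name and the table name in the commented query are illustrative assumptions, not part of the original recipe, and running it requires a Spark 1.x deployment with Hive support compiled in.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Assumed application name, for illustration only
val conf = new SparkConf().setAppName("HiveContextExample")
val sc = new SparkContext(conf)

// HiveContext wraps the SparkContext, exactly like SQLContext does,
// but adds the HiveQL parser, Hive UDFs, and Hive table access
val hiveContext = new HiveContext(sc)

// HiveQL queries are then issued through sql(); "person" is a
// hypothetical Hive table used here only to show the call shape:
// val df = hiveContext.sql("SELECT * FROM person LIMIT 10")
// df.show()
```

In the Spark shell this construction is unnecessary, since the preloaded sqlContext is already a HiveContext.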