Spark Cookbook

By: Rishi Yadav

Chapter 4. Spark SQL

Spark SQL is a Spark module for processing structured data. This chapter is divided into the following recipes:

  • Understanding the Catalyst optimizer

  • Creating HiveContext

  • Inferring schema using case classes

  • Programmatically specifying the schema

  • Loading and saving data using the Parquet format

  • Loading and saving data using the JSON format

  • Loading and saving data from relational databases

  • Loading and saving data from an arbitrary source