Spark Cookbook

By: Rishi Yadav


Loading and saving data from relational databases


In the previous chapter, we learned how to load data from a relational database into an RDD using JdbcRDD. Spark 1.4 adds support for loading data directly into a DataFrame from a JDBC source. This recipe shows how to do it.
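As a rough preview of the idea, the following is a minimal PySpark sketch of a JDBC load into a DataFrame. It assumes Spark 1.4 or later and an existing `SQLContext`. The host, port, database name (`exampledb`), and the helper names `mysql_jdbc_url` and `load_person` are hypothetical and chosen only for illustration; `read.jdbc` is the `DataFrameReader` method introduced in Spark 1.4.

```python
def mysql_jdbc_url(host, port, database):
    """Build a MySQL JDBC URL of the form jdbc:mysql://host:port/database."""
    return "jdbc:mysql://{0}:{1}/{2}".format(host, port, database)


def load_person(sql_context, url, user, password):
    """Load the person table into a DataFrame through DataFrameReader.jdbc.

    sql_context is assumed to be an existing pyspark.sql.SQLContext.
    """
    # user and password are standard JDBC connection properties.
    props = {"user": user, "password": password}
    return sql_context.read.jdbc(url=url, table="person", properties=props)


# Hypothetical connection details; replace them with your own.
url = mysql_jdbc_url("localhost", 3306, "exampledb")
print(url)
```

In the `pyspark` shell, where `sqlContext` is already defined, a call such as `load_person(sqlContext, url, "example_user", "example_password").show()` would display the rows, provided the MySQL driver JAR is on the classpath.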

Getting ready

Make sure that the JDBC driver JAR is visible on the client node and on every slave node on which an executor will run.
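One common way to meet this requirement is to pass the JAR at launch time with the `--driver-class-path` and `--jars` options, which the Spark launch scripts accept. The sketch below only assembles such a command line; the helper name `pyspark_command` and the JAR path are hypothetical.

```python
def pyspark_command(driver_jar):
    """Return an argument list that starts pyspark with the JDBC driver JAR."""
    return [
        "pyspark",
        # Adds the JAR to the classpath of the driver on the client node.
        "--driver-class-path", driver_jar,
        # Ships the JAR to the executors running on the other nodes.
        "--jars", driver_jar,
    ]


# Hypothetical location of the MySQL connector JAR.
cmd = pyspark_command("/path/to/mysql-connector-java.jar")
print(" ".join(cmd))
```

The same two options apply to `spark-shell` and `spark-submit`, so only the first element of the list would change for a Scala session.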

How to do it...

  1. Create a table named person in MySQL using the following DDL:

    CREATE TABLE `person` (
      `person_id` int(11) NOT NULL AUTO_INCREMENT,
      `first_name` varchar(30) DEFAULT NULL,
      `last_name` varchar(30) DEFAULT NULL,
      `gender` char(1) DEFAULT NULL,
      `age` tinyint(4) DEFAULT NULL,
      PRIMARY KEY (`person_id`)
    );
  2. Insert some data:

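    INSERT INTO person (first_name, last_name, gender, age) VALUES ('Barack','Obama','M',53);
    INSERT INTO person (first_name, last_name, gender, age) VALUES ('Bill','Clinton','M',71);
    INSERT INTO person (first_name, last_name, gender, age) VALUES ('Hillary','Clinton','F',68);
    INSERT INTO person (first_name, last_name, gender, age) VALUES ('Bill','Gates','M',69);
    INSERT INTO person (first_name, last_name, gender, age) VALUES ('Michelle','Obama','F...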