In this section, we will use JSON as our data format and save our data as JSON. The following topics will be covered:
- Saving data in JSON format
- Loading JSON data
- Testing
JSON is human-readable and carries more meaning than plain text because it includes schema information, such as field names, alongside the values. We will learn how to save data in JSON format and then how to load that JSON data back.
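For instance, a saved UserTransaction record might be written as a JSON line like the one below, with the field names preserved next to the values (the field names userId and amount are assumed from the constructor arguments in the examples that follow):

```json
{"userId":"a","amount":100}
```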
We will first create an RDD of UserTransaction("a", 100) and UserTransaction("b", 200), and use .toDF() to convert it into a DataFrame:
import spark.implicits._

val df = spark.sparkContext
  .makeRDD(List(UserTransaction("a", 100), UserTransaction("b", 200)))
  .toDF()
We will then call coalesce() and, this time, pass it the value 2, which gives us two resulting output files. We will then call the write.format method and, for...
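Putting these steps together, a minimal sketch might look like the following. Note that the UserTransaction case class definition and the output path transactions.json are assumptions for illustration; they are not specified in the text above:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical case class matching the example records
case class UserTransaction(userId: String, amount: Int)

val spark = SparkSession.builder()
  .master("local[2]")
  .appName("json-example")
  .getOrCreate()

import spark.implicits._

// Build a DataFrame from the two example records
val df = spark.sparkContext
  .makeRDD(List(UserTransaction("a", 100), UserTransaction("b", 200)))
  .toDF()

// coalesce(2) limits the output to two partitions,
// so Spark writes two part files in JSON format
df.coalesce(2)
  .write
  .format("json")
  .save("transactions.json") // hypothetical output directory
```

Each part file under the output directory contains one JSON object per line, which is what makes the format easy to load back with spark.read.json.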