The following references were used for the topics covered in this chapter. You might want to go through them for a more detailed understanding of individual sections.
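- A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets: https://databricks.com/blog/2016/07/14/a-tale-of-three-apache-spark-apis-rdds-dataframes-and-datasets.html
- How to Use SparkSession in Apache Spark 2.0: https://databricks.com/blog/2016/08/15/how-to-use-sparksession-in-apache-spark-2-0.html
- Apache Parquet: https://parquet.apache.org/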
- Setting up SparkR: https://www.youtube.com/watch?v=A5cBAPoidsg
- Spark SQL, DataFrames and Datasets Guide: http://spark.apache.org/docs/latest/sql-programming-guide.html
- Structured Streaming: https://www.youtube.com/watch?v=1a4pgYzeFwE&feature=youtu.be
- Catalyst Optimizer: https://www.youtube.com/watch?v=UBeewFjFVnQ&t=39s