DataFrames can easily be manipulated with SQL queries in Spark.
In this recipe, we will learn how to create a temporary view so that we can access the data within a DataFrame using SQL.
To execute this recipe, you need to have a working Spark 2.3 environment. You should have gone through the previous recipe, as we will be using the sample_data_schema DataFrame we created there.
There are no other requirements.
We simply use the .createTempView(...) method of a DataFrame:
sample_data_schema.createTempView('sample_data_view')
The .createTempView(...) method is the simplest way to create a temporary view that can later be used to query the data. The only required parameter is the name of the view.
Let's see how such a temporary view can now be used to extract data:
spark.sql('''
    SELECT Model
        , Year
        , RAM
        , HDD
    FROM sample_data_view
''').show()
We simply use the .sql(...) method of SparkSession, which allows...