In the previous recipe, we learned how to infer the schema of a DataFrame using reflection.
In this recipe, we will learn how to specify the schema programmatically.
To execute this recipe, you need to have a working Spark 2.3 environment.
There are no other requirements.
In this example, we build the schema explicitly using StructType and StructField objects:
import pyspark.sql.types as typ

sch = typ.StructType([
    typ.StructField('Id', typ.LongType(), False),
    typ.StructField('Model', typ.StringType(), True),
    typ.StructField('Year', typ.IntegerType(), True),
    typ.StructField('ScreenSize', typ.StringType(), True),
    typ.StructField('RAM', typ.StringType(), True),
    typ.StructField('HDD', typ.StringType(), True),
    typ.StructField('W', typ.DoubleType(), True),
    typ.StructField('D', typ.DoubleType(), True),
    typ.StructField('H', typ.DoubleType(), True),
    typ.StructField('Weight', typ.DoubleType(), True)
])