Apache Spark 2.x Cookbook
Correlation is a statistical relationship between two variables such that a change in one tends to be accompanied by a change in the other. Correlation analysis measures the extent to which the two variables are correlated. We see correlation in our daily life: the height of a person is correlated with the weight of a person, the load-carrying capacity of a truck is correlated with the number of wheels it has, and so on.
If an increase in one variable is accompanied by an increase in the other, it is called a positive correlation. If an increase in one variable is accompanied by a decrease in the other, it is a negative correlation.
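As a point of reference, the standard sample Pearson coefficient makes the sign convention concrete. The symbols here are introduced for illustration and are not from the book: $x_i$ and $y_i$ are the paired observations, and $\bar{x}$ and $\bar{y}$ are their means.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad -1 \le r \le 1
```

A value of $r > 0$ indicates a positive correlation, $r < 0$ a negative one, and a value near 0 indicates little or no linear relationship.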
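Spark supports two correlation algorithms: Pearson and Spearman. The Pearson algorithm measures the linear relationship between two continuous variables, such as a person's height and weight or house size and house price. Spearman is a rank-based measure: it works with ordinal variables, or with continuous variables whose relationship is monotonic but not necessarily linear, for example, a neighborhood's desirability ranking and house price. A purely nominal label such as a zip code has no meaningful order, so it is not a good fit for either method.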
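To make the difference between the two measures concrete, here is a minimal plain-Python sketch. It is not Spark's implementation and does not use the Spark API. The helper names (`pearson`, `ranks`, `spearman`) and the data are hypothetical, made up for illustration. It treats Spearman as Pearson applied to ranks, which is the standard definition.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson computed on the ranks of the data."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical data: price grows exponentially with size,
# so the relationship is monotonic but not linear.
sizes = [1.0, 2.0, 3.0, 4.0, 5.0]
prices = [math.exp(s) for s in sizes]

pearson_r = pearson(sizes, prices)    # positive, but below 1
spearman_r = spearman(sizes, prices)  # 1.0: the rank orders agree exactly

print("Pearson: ", pearson_r)
print("Spearman:", spearman_r)
```

On this made-up data, Spearman reports a perfect correlation because price always rises with size, while Pearson stays below 1 because the rise is not a straight line.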
Let's use some real data so that we can calculate...