In this recipe, we demonstrate One-vs-Rest in Apache Spark 2.0. The OneVsRest() classifier makes a binary classifier, here logistic regression, work for a multi-class classification problem by training one binary model per class. The recipe is a two-step approach: we first configure a LogisticRegression() object and then pass it to a OneVsRest() classifier, which uses it to solve the multi-class problem.
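The two-step approach described above can be sketched as follows. This is a minimal illustration rather than the recipe's actual listing: the object name OneVsRestSketch, the hyperparameter values, the 80/20 split, and the local path data/iris.scale are all assumptions made for the example.

```scala
import org.apache.spark.ml.classification.{LogisticRegression, OneVsRest}
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.sql.SparkSession

// Hypothetical object name; not taken from the recipe.
object OneVsRestSketch {

  // Step 1: configure the binary base classifier.
  // The hyperparameter values are illustrative assumptions.
  val lr: LogisticRegression = new LogisticRegression()
    .setMaxIter(20)
    .setRegParam(0.1)
    .setFitIntercept(true)

  // Step 2: wrap it in OneVsRest, which fits one binary model per class.
  val ovr: OneVsRest = new OneVsRest().setClassifier(lr)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("OneVsRestSketch")
      .getOrCreate()

    // Assumed local path to the downloaded LIBSVM file.
    val data = spark.read.format("libsvm").load("data/iris.scale")

    // Assumed 80/20 train/test split with a fixed seed.
    val Array(train, test) = data.randomSplit(Array(0.8, 0.2), seed = 42L)

    // Fit the per-class binary models and score the held-out rows.
    val model = ovr.fit(train)
    val predictions = model.transform(test)

    val accuracy = new MulticlassClassificationEvaluator()
      .setMetricName("accuracy")
      .evaluate(predictions)
    println(s"Test accuracy = $accuracy")

    spark.stop()
  }
}
```

Keeping the base classifier separate from the wrapper means the same OneVsRest() setup could be reused with a different binary classifier by changing only the setClassifier() argument.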
- Go to the LIBSVM Data: Classification (Multi-class) repository and download the file: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass/iris.scale
- Start a new project in IntelliJ or in an IDE of your choice. Make sure the necessary JAR files are included.
- Set up the package location where the program will reside:
package spark.ml.cookbook.chapter5
- Import the necessary packages for the SparkSession to gain access to the cluster and Log4j.Logger to reduce the amount of output produced...