After running the following code, the PimaIndiansDiabetes R data frame will be loaded, and we will run the usual str() and summary() functions on it. Note that we first need to install the mlbench package, because the data is contained within that package.
At this point, no Spark directives have been introduced. Even though we are running in a Databricks environment, the code is pure R, and you can replicate it in your regular R environment as well.
# install the mlbench package and load the library
devtools::install_github("cran/mlbench")
library(mlbench)

# load the data frame and take a first look at it
data(PimaIndiansDiabetes)
str(PimaIndiansDiabetes)
summary(PimaIndiansDiabetes)
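If devtools is not available in your environment, the following is a minimal sketch of an alternative that installs mlbench from CRAN only when it is missing. The helper name ensure_package and the choice of CRAN mirror are assumptions made for illustration; they are not part of the original listing.

```r
# Hypothetical helper (an assumption, not from the original listing):
# install a package from CRAN only if it is not already available,
# then report whether it can be loaded.
ensure_package <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    # Installation can fail (no network, no write access); do not stop the script.
    try(install.packages(pkg, repos = repos), silent = TRUE)
  }
  requireNamespace(pkg, quietly = TRUE)
}

if (ensure_package("mlbench")) {
  # Load the data frame directly from the package
  data(PimaIndiansDiabetes, package = "mlbench")
  print(dim(PimaIndiansDiabetes))             # number of rows and columns
  print(colSums(is.na(PimaIndiansDiabetes)))  # count of NA values per column
} else {
  message("mlbench could not be installed; skipping the data load.")
}
```

The two extra checks, dim() and a per-column NA count, are quick complements to str() and summary() rather than replacements for them.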
As usual, the str() and summary() functions will give you your first insights into the data. The output will appear in the console pane, which is typically right below the coding pane.
Note: not all output is shown.