Numeric variables are convenient during statistical analysis, but sometimes we need to create categorical (factor) variables from them. We can derive a limited number of categories from a numeric variable using a series of conditional statements, but this is not an efficient way to perform the operation. In R, cut is a generic function for creating factor variables from numeric variables. In the following example, we will see how we can create factors from a numeric variable using a series of conditional statements. We will also use the cut function to perform the same task.
# creating a numeric variable by taking 100 random numbers
# from a normal distribution
set.seed(1234) # setting seed to reproduce the example
numvar <- rnorm(100)

# creating a factor variable with 5 distinct categories
num2factor <- cut(numvar, breaks=5)
class(num2factor)
[1] "factor"
levels(num2factor)
[1] "(-2.35,-1.37]" "(-1.37,-0.389]" "(-0.389,0.592]" "(0.592,1.57]" ...
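The conditional-statement approach mentioned earlier can be compared directly with cut. The following is a minimal sketch, not the author's own listing: the cut points (-1, 0, 1), the category labels, and the names brks, labs, manual, and viacut are all assumptions chosen for illustration. It builds the same four-level factor twice, once with nested ifelse calls and once with cut using explicit breaks, and then checks that the two results agree.

```r
# Same numeric variable as in the example above
set.seed(1234)
numvar <- rnorm(100)

# Hypothetical cut points and labels, chosen only for illustration
brks <- c(-Inf, -1, 0, 1, Inf)
labs <- c("very low", "low", "high", "very high")

# Approach 1: a series of conditional statements (nested ifelse)
manual <- ifelse(numvar <= -1, "very low",
          ifelse(numvar <= 0,  "low",
          ifelse(numvar <= 1,  "high", "very high")))
manual <- factor(manual, levels = labs)

# Approach 2: cut with explicit breaks; intervals are right-closed
# by default, which matches the <= comparisons used above
viacut <- cut(numvar, breaks = brks, labels = labs)

# Both approaches should assign every value to the same category
all(as.character(manual) == as.character(viacut))
table(viacut)
```

With cut, adding or moving a category only requires changing brks and labs, whereas the nested ifelse version needs another level of nesting for each additional category.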