Using random imputation to match a variable's distribution
This recipe imputes missing values with actual values, selected at random, from the same variable. It is valuable when you do not want to impute with a constant and the variable's distribution is not replicated well by a normal or uniform random imputation method.
In this recipe we will replace each missing or blank value of a variable with a value drawn at random from the variable's own known values. The imputed values will therefore follow the actual distribution of the variable itself.
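Outside of Modeler, the same idea can be sketched in a few lines of Python. This is only an illustration of the technique, not the stream used in this recipe: the function name impute_from_observed, the sample ages list, and the choice to represent missing values as None are all assumptions made for the example. Each missing entry is filled with a value picked at random (with replacement) from the non-missing entries, so frequent values are picked more often and the imputed values tend to follow the observed distribution.

```python
import random


def impute_from_observed(values, seed=None):
    """Fill each None with a value drawn at random from the observed values.

    Hypothetical helper for illustration only; missing values are assumed
    to be represented as None.
    """
    rng = random.Random(seed)  # a seed makes the draw reproducible
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to sample from")
    # Observed values are kept as-is; each missing one is sampled
    # with replacement from the observed pool.
    return [v if v is not None else rng.choice(observed) for v in values]


# Made-up example data: 34 appears most often, so it is the most
# likely replacement for a missing entry.
ages = [34, None, 51, 34, None, 78, 34, None]
imputed = impute_from_observed(ages, seed=42)
print(imputed)
```

Because the replacements are sampled from the variable itself, no distributional assumption (normal, uniform, or otherwise) is needed.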
Getting ready
This recipe uses the following files:
Datafile:
cup98lrn_variable cleaning random impute recipe.sav
Stream file:
Recipe – impute missing with actual values.str
How to do it...
Open the stream (Recipe – impute missing with actual values.str) by navigating to File | Open Stream.
Make sure the datafile points to the correct path and to the datafile (cup98lrn_variable cleaning random impute recipe.sav).
Open...