There are many ways to deal with missing records. The simplest one is to delete them. This works especially well when we have a relatively large dataset. One caveat is that deleting the missing data should not change the dataset in any fundamental way. In other words, if the records are missing at random, then simply deleting them will not bias the result.
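The idea of deleting incomplete records can be sketched in a few lines of plain Python. The records, the field names (price, volume), and the helper is_complete() are hypothetical and used only for illustration. The share of dropped records is printed as a rough check on how much of the dataset the deletion removes.

```python
import math

# Hypothetical records; float("nan") marks a missing value.
nan = float("nan")
records = [
    {"price": 10.0, "volume": 100.0},
    {"price": nan,  "volume": 120.0},
    {"price": 12.5, "volume": nan},
    {"price": 11.0, "volume": 90.0},
    {"price": nan,  "volume": 110.0},
    {"price": 13.0, "volume": 95.0},
]

def is_complete(record):
    """Return True when no field of the record is NaN."""
    return not any(math.isnan(value) for value in record.values())

# Keep only the records that have no missing field.
clean = [r for r in records if is_complete(r)]

# Share of records lost to deletion.
dropped_share = 1 - len(clean) / len(records)

print(f"kept {len(clean)} of {len(records)} records; "
      f"dropped share = {dropped_share:.2f}")
```

A large dropped share does not by itself show that the result is biased, but it is a signal to check whether the records are really missing at random before relying on the reduced dataset.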
Dealing with missing data
Removing missing data
The following R program uses the na.omit() function:
> x<-c(NA,1,2,50,NA)
> y<-na.omit(x)
> mean(x)
[1] NA
> mean(y)
[1] 17.66667
Another R function, na.exclude(), could be used as well. The following Python program removes all missing (NaN) values:
import scipy as...