Missing values are values that should have been recorded but, for some reason, were not; R marks them with NA. They are different from values without meaning, which R represents with NaN (not a number).
Most of us first ran into missing values in a situation like this one:
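To make the distinction concrete, here is a minimal sketch using base R's is.na() and is.nan(). The variable names are made up for illustration, and 0 / 0 is just one convenient way to produce a NaN.

```r
# A missing value: something that should have been recorded but wasn't
missing_value <- NA

# A value without meaning: the result of an undefined operation
undefined_value <- 0 / 0   # evaluates to NaN

# is.na() flags both NA and NaN
na_flags <- c(is.na(missing_value), is.na(undefined_value))    # TRUE TRUE

# is.nan() flags only NaN, so it can tell the two apart
nan_flags <- c(is.nan(missing_value), is.nan(undefined_value)) # FALSE TRUE

print(na_flags)
print(nan_flags)
```

Note that is.na() alone cannot separate the two cases; if the difference matters for your data, check is.nan() as well.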
> x <- c(1,2,3,NA,4)
> mean(x)
[1] NA
"Oh come on, I know you can do it. Just ignore that useless NA" was probably your reaction, or at least it was mine.
Fortunately, R comes packed with good functions for missing value detection and handling.
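As a quick illustration, the following sketch applies a few of those base R functions to the same kind of vector. It is only one possible selection, not an exhaustive tour, and the variable names are made up for the example.

```r
x <- c(1, 2, 3, NA, 4)

# Detection
na_mask  <- is.na(x)         # logical vector, TRUE where a value is missing
has_na   <- anyNA(x)         # TRUE if at least one value is missing
na_count <- sum(is.na(x))    # how many values are missing
na_index <- which(is.na(x))  # positions of the missing values

# Handling: many summary functions accept na.rm = TRUE,
# which drops the NAs before computing the result
mean_without_na <- mean(x, na.rm = TRUE)

print(na_index)
print(mean_without_na)
```

With na.rm = TRUE the mean is computed over the four recorded values only, which is exactly the "just ignore that NA" behavior we asked for.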
In this recipe and the following one, we will see two opposite approaches to missing value handling:
Removing missing values
Estimating missing values by interpolation
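To give a feel for how the two approaches differ, here is a rough sketch on a small vector. It uses base R's na.omit() for removal and approx() for linear interpolation; this is an assumption about tooling made for illustration, and the recipes themselves may rely on different functions.

```r
x <- c(1, 2, 3, NA, 4)

# Approach 1: remove the missing values.
# as.numeric() drops the bookkeeping attribute that na.omit() attaches.
x_removed <- as.numeric(na.omit(x))

# Approach 2: fill the gap by linear interpolation.
# The known values are passed to approx() together with their positions,
# and an estimate is requested at every position of the original vector.
known <- which(!is.na(x))
x_interpolated <- approx(known, x[known], xout = seq_along(x))$y

print(x_removed)       # 1 2 3 4
print(x_interpolated)  # 1 2 3 3.5 4
```

Removal shortens the vector and shifts the positions of the remaining values, while interpolation keeps the original length and replaces the gap with a value estimated from its neighbors.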
I have to warn you that removing missing values is appropriate only in a small number of cases, since it compromises the integrity of your data and can greatly reduce the reliability of your results.