One of the key tasks in designing a neural network application is selecting appropriate inputs. In the unsupervised case, one wishes to use only relevant variables, in which the neural network will find patterns. In the supervised case, the outputs must be mapped to the inputs, so one should choose only the input variables that have some influence on the output.
One strategy that helps in selecting good inputs in the supervised case is to examine the correlation between data series. The correlation between two data series measures how strongly one data sequence varies with the other, which indicates how much one may react to or influence the other. Suppose that we have a dataset containing a number of data series, from which we choose one to be the output. We now need to select the inputs from the remaining variables.
We then evaluate the influence of one variable at a time on the output in order to decide whether to include it as an input. The Pearson coefficient is one of the most widely used measures:
Where Sx(k)y(k)...
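The selection procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the book's implementation: it assumes the standard definition of the Pearson coefficient (the covariance of the two series divided by the product of their standard deviations), and the function names, the example series, and the cut-off of 0.5 are all hypothetical choices made for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centered products; the 1/n factors cancel in the ratio.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # Assumption: a constant series is treated as uncorrelated.
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_inputs(candidates, output, threshold=0.5):
    """Score each candidate series against the output, one at a time.

    Returns (selected_names, scores). The threshold is an arbitrary
    illustrative cut-off on |r|, not a recommended value.
    """
    scores = {name: pearson(series, output) for name, series in candidates.items()}
    selected = [name for name, r in scores.items() if abs(r) >= threshold]
    return selected, scores


# Hypothetical data: one output series and two candidate input series.
output = [2.0, 4.1, 6.0, 8.2, 9.9]
candidates = {
    "temperature": [1.0, 2.0, 3.0, 4.0, 5.0],  # moves closely with the output
    "noise": [3.0, 1.0, 4.0, 1.0, 5.0],        # only weakly related
}

selected, scores = select_inputs(candidates, output)
for name, r in scores.items():
    print(f"{name}: r = {r:.3f}")
print("selected inputs:", selected)
```

The absolute value of the coefficient is compared with the threshold because a strongly negative correlation is as informative for the network as a strongly positive one. In practice, the cut-off would be chosen by inspecting the scores of all the candidates rather than fixed in advance.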