One way to normalize values is to scale frequencies by the size of their groups. For example, say the word truth appears three times in a document. That means one thing if the document has 30 words. It means something else if the document has 300 words, or 3,000. And if the dataset has documents of all those lengths, how do we compare the frequencies for words across documents?
The answer is to rescale the frequency counts. In the simplest case, we can divide each term's count by the length of its document. If we want better results, we might use something more sophisticated, such as tf-idf (term frequency-inverse document frequency), which also discounts terms that appear in many documents. Wikipedia has a good overview of this technique at http://en.wikipedia.org/wiki/Tf-idf.
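To make the tf-idf idea concrete, here is a minimal Python sketch of one common variant: term frequency as count divided by document length, multiplied by the natural log of (number of documents / number of documents containing the term). The function name `tf_idf` and the sample documents are illustrative assumptions, and other tf-idf formulations use different smoothing or log bases.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Hypothetical helper: documents is a list of token lists.

    Returns one {term: weight} dict per document, using
    tf = count / document length and idf = ln(N / document frequency).
    """
    n_docs = len(documents)
    # Document frequency: how many documents contain each term at least once.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Made-up sample data: "the" occurs in every document, "truth" in two of three.
docs = [
    ["the", "truth", "is", "out"],
    ["the", "truth", "hurts"],
    ["the", "cat", "sat"],
]
weights = tf_idf(docs)
```

In this variant, a term that occurs in every document (here, "the") gets a weight of zero, because ln(1) = 0, while rarer terms keep a positive weight.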
For this recipe, we'll rescale some term frequencies by the total word count for their document.
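As a rough illustration of this rescaling, the following Python sketch divides each term's count by the total word count of its document. The function name `relative_frequencies` and the two sample documents are assumptions made up for this example; they simply mirror the 30-word and 300-word cases described earlier.

```python
from collections import Counter

def relative_frequencies(tokens):
    """Hypothetical helper: map each term to count / total word count."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        # An empty document has no frequencies to rescale.
        return {}
    return {term: count / total for term, count in counts.items()}

# Made-up documents: "truth" appears three times in each,
# but the documents have 30 and 300 words respectively.
short_doc = ["truth"] * 3 + ["filler"] * 27
long_doc = ["truth"] * 3 + ["filler"] * 297

short_freqs = relative_frequencies(short_doc)
long_freqs = relative_frequencies(long_doc)

print(short_freqs["truth"])  # 3 / 30
print(long_freqs["truth"])   # 3 / 300
```

After rescaling, the same raw count of three becomes a much larger share of the short document than of the long one, which makes the values comparable across documents of different lengths.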