Like the previous recipe, this recipe finds interesting phrases, but it uses a second language model as a point of comparison to decide what is interesting. Amazon's statistically improbable phrases (SIPs) work this way. Amazon describes them on its website at http://www.amazon.com/gp/search-inside/sipshelp.html:
"Amazon.com's Statistically Improbable Phrases, or "SIPs", are the most distinctive phrases in the text of books in the Search Inside!™ program. To identify SIPs, our computers scan the text of all books in the Search Inside! program. If they find a phrase that occurs a large number of times in a particular book relative to all Search Inside! books, that phrase is a SIP in that book.
SIPs are not necessarily improbable within a particular book, but they are improbable relative to all books in Search Inside!."
The foreground model will be the book being processed, and the background model will be all the other books in Amazon's Search Inside...
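The foreground/background idea can be illustrated with a minimal sketch. The code below is not Amazon's method or any library's API; it is an assumption-laden toy that treats phrases as word bigrams, splits on whitespace, and scores each foreground phrase by its count times the log ratio of its foreground probability to an add-one-smoothed background probability. The names `bigrams` and `improbable_phrases` are hypothetical and exist only for this illustration.

```python
import math
from collections import Counter


def bigrams(text):
    """Split text on whitespace and return its lowercased word bigrams."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))


def improbable_phrases(foreground_text, background_texts, top_n=5):
    """Rank foreground bigrams by how surprising they are given the background.

    Assumption: score = count * log(p_foreground / p_background), with
    add-one smoothing on the background so unseen phrases get a nonzero
    probability. This is one simple choice among many possible scores.
    """
    fg = Counter(bigrams(foreground_text))
    bg = Counter()
    for text in background_texts:
        bg.update(bigrams(text))

    fg_total = sum(fg.values())
    bg_total = sum(bg.values())
    vocab_size = len(set(fg) | set(bg))

    scored = []
    for phrase, count in fg.items():
        p_fg = count / fg_total
        p_bg = (bg[phrase] + 1) / (bg_total + vocab_size)
        scored.append((count * math.log(p_fg / p_bg), " ".join(phrase)))

    # Highest score first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [phrase for _, phrase in scored[:top_n]]


if __name__ == "__main__":
    book = "the quantum flux drives the quantum flux engine"
    other_books = ["the cat sat on the mat", "the dog drives the car"]
    print(improbable_phrases(book, other_books))
```

In this toy example, a phrase such as "drives the" ranks low because it also appears in the background texts, while phrases that are frequent in the foreground and absent from the background rise to the top. That is the same contrast the quoted description draws between one book and all other books.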