In natural language processing, uninformative words or characters, called stop words, are often filtered out to make text easier to handle. When computing word frequencies or extracting sentiment data from a corpus, punctuation and special characters may need to be ignored. This recipe demonstrates how to remove these characters from the body of a text.
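To see why punctuation matters for word frequencies, consider the following minimal sketch. It is not part of the recipe itself: the names countWord, stripPunctuation, and sample are hypothetical and chosen only for illustration, and the set of punctuation characters is an assumption. It uses only the Prelude.

```haskell
-- Illustrative sketch only; these names are assumptions, not the recipe's functions.

-- Count how many whitespace-separated tokens equal the given word exactly.
countWord :: String -> String -> Int
countWord w = length . filter (== w) . words

-- Drop a small, assumed set of punctuation characters.
stripPunctuation :: String -> String
stripPunctuation = filter (`notElem` ".,?!()")

-- A made-up sample sentence for the demonstration.
sample :: String
sample = "we play chess. Kasparov plays chess?"

main :: IO ()
main = do
  -- The tokens are "chess." and "chess?", so an exact match finds nothing.
  print (countWord "chess" sample)                     -- prints 0
  -- After stripping punctuation, both occurrences are counted.
  print (countWord "chess" (stripPunctuation sample))  -- prints 2
```

Without the filtering step, "chess." and "chess?" are treated as two distinct tokens, and neither one matches the plain word.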
No imports are necessary. Create a new file, which we will call Main.hs, and perform the following steps:
Implement main and define a string called quote. The backslashes (\) let the string literal span multiple lines:

    main :: IO ()
    main = do
      let quote = "Deep Blue plays very good chess-so what?\
            \Does that tell you something about how we play chess?\
            \No. Does it tell you about how Kasparov envisions,\
            \understands a chessboard? (Douglas Hofstadter)"
      putStrLn $ (removePunctuation . replaceSpecialSymbols) quote
Replace all punctuation marks with an empty string, and...